Characterization of Information Channels for 
Asymptotic Mean Stationarity and Stochastic 
Stability of Non-stationary/Unstable Linear Systems
Abstract: Stabilization of non-stationary linear systems over
noisy communication channels is considered. Stochastically stable
sources, and unstable but noise-free or bounded-noise systems
have been extensively studied in the information theory and control
theory literature since the 1970s, with a renewed interest in
the past decade. There have also been studies on non-causal 
and causal coding of unstable/non-stationary linear Gaussian 
sources. In this paper, tight necessary and sufficient conditions 
for stochastic stabilizability of unstable (non-stationary) possi- 
bly multi-dimensional linear systems driven by Gaussian noise 
over discrete channels (possibly with memory and feedback) 
are presented. Stochastic stability notions include recurrence, 
asymptotic mean stationarity and sample path ergodicity, and 
the existence of finite second moments. Our constructive proof 
uses random-time state-dependent stochastic drift criteria for 
stabilization of Markov chains. For asymptotic mean stationarity 
(and thus sample path ergodicity), it is sufficient that the capacity 
of a channel is (strictly) greater than the sum of the logarithms of 
the unstable pole magnitudes for memoryless channels and a class 
of channels with memory. This condition is also necessary under 
a mild technical condition. Sufficient conditions for the existence 
of finite average second moments for such systems driven by 
unbounded noise are provided. 

Keywords: Stochastic stability, asymptotic mean stationar- 
ity, non-asymptotic information theory, Markov chains, stochas- 
tic control, feedback. 



I. Problem Formulation 

This paper considers stochastic stabilization of linear sys- 
tems controlled or estimated over discrete noisy channels with 
feedback. We consider first a scalar LTI discrete-time system
(we consider multi-dimensional systems in Section IV) described by

$$x_{t+1} = a x_t + b u_t + d_t, \qquad t \geq 0. \tag{1}$$

Here $x_t$ is the state at time $t$, $u_t$ is the control input, the initial
condition $x_0$ is a second-order random variable, and $\{d_t\}$ is
a sequence of zero-mean independent, identically distributed
(i.i.d.) Gaussian random variables. It is assumed that $|a| > 1$
and $b \neq 0$: the system is open-loop unstable, but it is stabilizable.
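The open-loop instability of (1) can be illustrated numerically. The following is a minimal sketch, assuming $u_t = 0$; the function name `simulate_open_loop`, the parameter values, and the fixed seed are illustrative choices and do not come from the paper.

```python
import random


def simulate_open_loop(a, x0, noise_std, steps, seed=0):
    """Iterate x_{t+1} = a * x_t + d_t with u_t = 0 and d_t ~ N(0, noise_std^2).

    Hypothetical helper for illustration only; returns [x_0, ..., x_steps].
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(steps):
        x = a * x + rng.gauss(0.0, noise_std)
        path.append(x)
    return path


# With |a| > 1 the uncontrolled state magnitude grows roughly like |a|^t.
path = simulate_open_loop(a=1.5, x0=1.0, noise_std=0.1, steps=40)
print(abs(path[0]), abs(path[-1]))
```

Without a control input the state escapes every bounded set, which is why the channel must carry enough information for the controller to counteract the growth.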

This system is connected over a Discrete Noisy Channel 
with a finite capacity to a controller, as shown in Figure 1.

The controller has access to the information it has received 
through the channel. The controller in our model estimates the 
state and then applies its control. 

Remark 1.1: We note that the existence of the control can 
also be regarded as an estimation correction, and all results 
regarding stability may equivalently be viewed as the stability 
of the estimation error. Thus, the two problems are identical 
for such a controllable system and the reader unfamiliar with 
control theory can simply replace the stability of the state, 
with the stability of the estimation error. 

Recall the following definitions. 

Definition 1.1: A finite-alphabet channel with memory is
characterized by a sequence of finite input alphabets $\mathcal{M}^{n+1}$,
finite output alphabets $\mathcal{M}'^{n+1}$, and a sequence of conditional
probability measures $P_n(q'_{[0,n]} \mid q_{[0,n]})$, from $\mathcal{M}^{n+1} \times \mathcal{M}'^{n+1}$
to $\mathbb{R}$, with

$$q_{[0,n]} := \{q_0, q_1, \dots, q_n\}, \qquad q'_{[0,n]} := \{q'_0, q'_1, \dots, q'_n\}.$$

Definition 1.2: A Discrete Memoryless Channel (DMC) is
characterized by a finite input alphabet $\mathcal{M}$, a finite output alphabet
$\mathcal{M}'$, and a conditional probability mass function $P(q' \mid q)$
from $\mathcal{M} \times \mathcal{M}'$ to $\mathbb{R}$. Let $q_{[0,n]} \in \mathcal{M}^{n+1}$ be a sequence
of input symbols, and let $q'_{[0,n]} \in \mathcal{M}'^{n+1}$ be a sequence of
output symbols, where $q_k \in \mathcal{M}$ and $q'_k \in \mathcal{M}'$ for all $k$. Let
$P^{n+1}_{DMC}$ denote the joint mass function on the $(n+1)$-tuple input
and output spaces. A DMC from $\mathcal{M}^{n+1}$ to $\mathcal{M}'^{n+1}$ satisfies
the following:

$$P^{n+1}_{DMC}(q'_{[0,n]} \mid q_{[0,n]}) = \prod_{k=0}^{n} P_{DMC}(q'_k \mid q_k), \qquad \forall\, q_{[0,n]} \in \mathcal{M}^{n+1},\ q'_{[0,n]} \in \mathcal{M}'^{n+1},$$

where $q_k, q'_k$ denote the $k$th component of the vectors $q_{[0,n]}, q'_{[0,n]}$, respectively. □
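The product form in Definition 1.2 can be checked on a small example. The sketch below assumes a binary symmetric channel with crossover probability 0.1 as the DMC; the function name `dmc_block_probability` and the dictionary layout of the transition law are hypothetical choices made for illustration.

```python
def dmc_block_probability(transition, inputs, outputs):
    """Probability of an output block given an input block over a DMC.

    transition[q][q_out] plays the role of P(q' | q); memorylessness means
    the block probability is the product of the per-symbol terms.
    """
    if len(inputs) != len(outputs):
        raise ValueError("input and output blocks must have equal length")
    prob = 1.0
    for q, q_out in zip(inputs, outputs):
        prob *= transition[q][q_out]
    return prob


# Illustrative binary symmetric channel (an assumed example, not from the paper).
eps = 0.1
bsc = {0: {0: 1 - eps, 1: eps}, 1: {0: eps, 1: 1 - eps}}
p = dmc_block_probability(bsc, [0, 1, 1], [0, 1, 0])
print(p)
```

Here two symbols are received correctly and one is flipped, so the block probability is the product of two success terms and one crossover term.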

In the problem considered, a source coder maps the infor- 
mation at the encoder to corresponding channel inputs. This is 
done through quantization and a channel encoder. The quan- 
tizer outputs are transmitted through a channel, after being 
subjected to a channel encoder. The receiver has access to
noisy versions of the quantizer/coder outputs for each time,
which we denote by $q'_t \in \mathcal{M}'$. The quantizer and the source
coder policy are causal, such that the channel input at time $t \geq 0$,
$q_t$, is generated using the information vector $I^s_t$ available at
the encoder for $t > 0$:

$$I^s_t = \{I^s_{t-1}, x_t, q_{t-1}, q'_{t-1}\},$$

and $I^s_0 = \{\nu_0, x_0\}$, where $\nu_0$ is the probability measure for
the initial state.

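The control policy at time $t$, also causal, is measurable on
the sigma-algebra generated by $I^c_t$, where, for $t \geq 1$,

$$I^c_t = \{I^c_{t-1}, q'_t\},$$

with $I^c_0$ the information available to the controller at time $0$, and is a mapping to $\mathbb{R}$.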

We will call such coding and control policies admissible 
policies. 
Fig. 1: Control over a discrete noisy channel with feedback.

The goal of the paper is to identify conditions on the channel
under which the controlled process $\{x_t\}$ is stochastically
stable in the sense that $\{x_t\}$ is recurrent, $\{x_t\}$ is asymptotically
mean stationary and satisfies Birkhoff's sample path ergodic
theorem, and $\lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \|x_t\|^2$ is finite almost
surely, under admissible coding and control policies. We will
make these notions and the contributions of the paper more
precise after the literature review in the next section.
The appendix in Section VII contains a review of relevant
definitions and results on stochastic stability of Markov chains
and ergodic theory.

Here is a brief summary of the paper. In the following, 
we will first provide a comprehensive literature review. In 
Section II, we state the main results of the paper. In Section 
III, we consider extensions to channels with memory, and in 
Section IV, we consider multi-dimensional settings. Section 
V contains the proofs of necessity and sufficiency results for 
stochastic stabilization. Section VI contains concluding remarks
and discusses a number of extensions. The paper ends
with an appendix, in Section VII, which contains a review
of stochastic stability of Markov chains and a discussion on
ergodic processes.

A. Literature Review 

There is a large literature on stochastic stabilization of sources 
via coding, both in the information theory and control theory 
communities. 

In the information theory literature, stochastic stability re- 
sults are established mostly for stationary sources, which are 
already in some appropriate sense stable sources. In this lit- 
erature, the stability of the estimation errors as well as the 
encoder state processes are studied. These systems mainly
involve causal and non-causal coding (block coding, as well
as sliding-block coding) of stationary sources [53], [29], [48],
and asymptotically mean stationary sources [35]. Real-time
settings such as sigma-delta quantization schemes have also
been considered in the literature; see, for example, [36] among
others.



There have also been important contributions on non-causal
coding of non-stationary/unstable sources. Consider the following
Gaussian AR process:

$$x_t = -\sum_{k=1}^{m} a_k x_{t-k} + w_t, \tag{2}$$

where $\{w_t\}$ is an independent and identically distributed, zero-mean,
Gaussian random sequence with variance $E[w_t^2] = \sigma^2$. If the roots
of the polynomial $V(z) := 1 + \sum_{k=1}^{m} a_k z^{-k}$ are all in the
interior of the unit circle, then the process is stationary and its
rate-distortion function (with the distortion being the expected,
normalized Euclidean error) is given parametrically (in terms
of the parameter $\theta$) by the following Kolmogorov formula [54],
[30], obtained by considering the asymptotic distribution of
the eigenvalues of the correlation matrix:

$$D_\theta = \frac{1}{2\pi} \int_{-\pi}^{\pi} \min\Big(\theta, \frac{1}{g(w)}\Big)\, dw,$$

$$R(D_\theta) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \max\Big(\frac{1}{2} \log \frac{1}{\theta g(w)}, 0\Big)\, dw,$$

with $g(w) = \frac{1}{\sigma^2} \big|1 + \sum_{k=1}^{m} a_k e^{-ikw}\big|^2$. If at least one root,
however, is on or outside the unit circle, the analysis is more
involved as the asymptotic eigenvalue distribution contains
unbounded components. [40], [30] and [34] showed that, using
the properties of the eigenvalues as well as Jensen's formula
for integrations along the unit circle, $R(D_\theta)$ above should be
replaced with:

$$R(D_\theta) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \max\Big(\frac{1}{2} \log \frac{1}{\theta g(w)}, 0\Big)\, dw + \sum_{k=1}^{m} \frac{1}{2} \max\big(0, \log(|p_k|^2)\big), \tag{3}$$

where $\{p_k\}$ are the roots of the polynomial $V$. We refer the
reader to a review in [34] regarding rate-distortion results for
such non-stationary processes and on the methods used in [40]
and [30].
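The second term of (3), the contribution of the roots outside the unit circle, is easy to evaluate numerically. The sketch below is illustrative only: the function name `unstable_pole_rate` is hypothetical, and the base-2 logarithm (rate in bits) is an assumption, since (3) does not fix the base.

```python
import math


def unstable_pole_rate(poles):
    """Evaluate sum_k (1/2) * max(0, log2(|p_k|^2)) for the given roots.

    Roots inside or on the unit circle contribute zero; each root outside
    contributes log2 of its magnitude. Base 2 is an assumed convention.
    """
    return sum(0.5 * max(0.0, math.log2(abs(p) ** 2)) for p in poles)


# Illustrative roots: two outside the unit circle, one inside.
rate = unstable_pole_rate([2.0, 0.5, -4.0])
print(rate)
```

For these values only the roots of magnitude 2 and 4 contribute, which gives a total of 1 + 2 bits per sample.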

Reference [6] obtained the rate-distortion function for Wiener
processes and, in addition, developed a two-part coding scheme,
which was later generalized in [75] and [78] (discussed further
below) to unstable Markov processes. The scheme in [6] exploits
the independent increment property of Wiener processes.

Thus, an important finding in the above literature is that, 
the logarithms of the unstable poles in such linear systems 
appear in the rate-distortion formulations, an issue which has 
also been observed in the networked control literature, which 
we will discuss further below. We also wish to emphasize that 
these coding schemes are non-causal, that is the encoder has 
access to the entire ensemble before the encoding begins. 

In contrast with information theory, due to the practical 
motivation of sensitivity to delay, the control theory literature 
has mainly considered causal/zero-delay coding for unstable 
(or non-stationary) sources, in the context of networked control 
systems. In the following, we will provide a discussion on the 
contributions in the literature which are contextually close to 
our paper. 
[12] studied the trade-off between delay and reliability, and
posed questions leading to an accelerated pace of research 
efforts on what we today know as networked control problems.
References [93], [85], and [66] obtained the minimum
rate needed for stabilization over noisy channels under
a class of assumptions on the system noise and channels.
This result states that for stabilizability under information constraints,
in the mean-square sense, the rate $R$ has to be at least
the sum of the logarithms of the unstable poles/eigenvalues
in the system; that is:

$$R \geq \sum_{k=1}^{m} \frac{1}{2} \max\big(0, \log_2(|p_k|^2)\big). \tag{4}$$

Comparing this result with (3), we observe that the rate requirement
is not due to causality but to the (differential) entropy
rate of the unstable system.

For coding and information transmission for unstable linear
systems, there is an important difference between continuous-alphabet
and finite-alphabet (discrete) channels, as discussed
in [95]: When the channel alphabet is continuous, we do not
necessarily need to consider adaptation in the encoders. On
the other hand, when the channel is finite alphabet and the
system is driven by unbounded noise, a static quantizer leads
to almost sure instability (see Proposition 5.1 in [66] and
Theorem 4.2 in [95]). With this observation, [66] considered a
class of variable-rate quantizer policies for such unstable linear
systems driven by noise with unbounded support,
controlled over noiseless channels, and
obtained necessary and sufficient conditions for the boundedness
of the following expression:

$$\limsup_{t \to \infty} E[\|x_t\|^2] < \infty.$$

With fixed rate, reference [97] obtained a somewhat stronger
expression and established a limit

$$\lim_{t \to \infty} E[\|x_t\|^2] < \infty,$$

and obtained a scheme which made the state process and the
encoder process stochastically stable in the sense that the joint
process is a positive Harris recurrent Markov chain and the
sample path ergodic theorem is applicable.

Reference [56] established that when a channel is present
in a controlled linear system, under stationarity assumptions,
the rate requirement in (4) is necessary for having finite second
moments for the state variable. A related argument was
made in [95] under the assumption of invariance conditions
for the controlled state process under memoryless policies
and finite second moments. In this paper, in Theorem 4.1,
we will present a very general result along this direction for a
general class of channels and a weaker stability notion. Such
settings were further considered in the literature. The problem 
of control over noisy channels has been considered in many 
publications including 0, (85), EE), (53, El, (59), E3, 
[84 1 among others. Many of the constructive results involve 
Gaussian channels, or erasure channels (some modeled as infi- 
nite capacity erasure channels as in ll79l and [42]). Other works 
have considered cases where there is either no disturbance or 



that the disturbance is bounded, with regard to noisy sources 
and noisy channels. We discuss some of these in the following. 

It is to be stressed that the notion of stochastic stability is
very important in characterizing the conditions on the channel.
References [60], [59] considered stabilization in the following
sense, when the system noise is bounded:

$$\limsup_{t \to \infty} |x_t| < \infty \quad \text{a.s.},$$

and observed that one needs the zero-error capacity (with
feedback) to be greater than a particular lower bound. A similar
observation was made in [78], which we will discuss
further in the following. When the system is driven by noise
which admits a probability measure with unbounded support, 
the stability requirement above is impossible for an infinite 
horizon problem, even when the system is open-loop stable, 
since for any bound, there exists almost surely a realization 
of a noise variable which will be larger. 

References [77], [78] considered systems driven by bounded
noise and considered a number of stability criteria: almost
sure stability for noise-free systems, moment stability for systems
with bounded noise ($\limsup_{t \to \infty} E[|x_t|^p] < \infty$), as well
as stability in probability (defined in [59]) for systems with
bounded noise. Stability in probability is defined as follows:
For every $p > 0$, there exists a $\zeta$ such that $P(|x_t| > \zeta) < p$
for all $t \in \mathbb{N}$. [77] and [78] also offered a novel and insightful
characterization of reliability for controlling unstable
processes, named any-time capacity, as the characterization
of channels for which the following criterion can be satisfied:

$$\limsup_{t \to \infty} E[|x_t|^p] < \infty,$$

for positive moments $p$. A channel is $\alpha$-any-time reliable
for a sequential coding scheme if $P(\hat{m}_{t-d}(t) \neq m_{t-d}(t)) \leq K 2^{-\alpha d}$
for all $t, d$. Here $m_{t-d}$ is the message transmitted at
time $t - d$, estimated at time $t$. One interesting aspect of an
any-time decoder is the independence from the delay, with a
fixed encoder policy. [78] states that for a system driven by
bounded noise, stabilization is possible if the maximum rate
for which an any-time reliability of $2 \log_2(|p_1|)$ is satisfied
is greater than $\log_2(|p_1|)$, where $p_1$ is the unstable pole of a
linear system.

In a related context, El, (78), (59) and ED considered the 
relevance to Shannon capacity. [55] observed that when the
moment coefficient goes to zero, Shannon capacity provides
the right characterization on whether a channel is sufficient
or insufficient, when noise is bounded. A parallel argument
is provided by [78], in Section III.C.1, observing that in the
limit when $p \to 0$, capacity should be the right measure for the
objective of satisfying stability in probability. Their discussion
was for bounded noise signals. [59] also made a parallel
observation, again for bounded noise signals.

With a departure from the bounded noise assumption, [58]
extended the discussion in [78] and studied a more general
model of multi-dimensional systems driven by an unbounded
noise process, again considering stability in probability. [58]
also showed that when the discrete noisy channel has capacity
less than $\log_2(|a|)$, where $a$ is defined in (1), there exists no
stabilizing scheme, and if the capacity is strictly greater than
this number, there exists a stabilizing scheme in the sense of
stability in probability.

Many network applications and networked control applications
require the access of control and sensor information to
be observed intermittently. Toward generating a solution for
such problems, [94] and [96] developed random-time state-dependent
drift conditions leading to the existence of an invariant
distribution possibly with moment constraints, extending
the earlier deterministic state-dependent results in [63]. Using
drift arguments, [95] considered noisy (both discrete and continuous
alphabet) channels, [97] considered noiseless channels
and [94] considered erasure channels for the following stability
criteria: the existence of an invariant distribution, and the
existence of an invariant distribution with finite moments.

References [24], ll56l and [57 1 considered general channels 
(possibly with memory), and with a connection with Jensen's 
formula and Bode's sensitivity integral, developed necessary 
and sufficient rates for stabilization under various networked 
control settings. Reference [65] considered erasure channels 
and obtained necessary and sufficient time-varying rate condi- 
tions for control over such channels. Reference ifTTl considered 
second moment stability over a class of Markov channels with 
feedback and developed necessary and sufficient conditions, 
for systems driven by an unbounded noise. Reference [38 1 
considered the stochastic stability over erasure channels, par- 
allel to the results of ||94l . 

On the other hand, for more traditional information theo- 
retic settings where the source is revealed at the beginning 
of transmission, and for cases where causality and delay are 
not important, the separation principle for source and channel 
coding results are applicable for ergodic sources and informa- 
tion stable channels. The separation principle for more general 
setups has been considered in [89|, among others. References 
[92 1 and [91 1 studied the optimal causal coding problem over 
respectively a noiseless channel and a noisy channel with 
noiseless feedback. Unknown sources have been considered 
in lfl5ll . We also note that, when noise is bounded, binning 
based strategies, inspired from Wyner-Ziv and Slepian-Wolf 
coding schemes are applicable. This type of consideration has 
been applied in |f78l . Ir95l and 11371 . Finally quantizer design 
for noiseless or bounded noise systems include [26], [25 1 and 
[43 1 . Channel coding algorithms for control systems have been 
presented recently in [83), ED and EQ. 

There also has been progress on coding for noisy channels 
for the transmission of sources with memory. Due to practical 
relevance, for communication of sources with memory over 
channels, particular emphasis has been placed on Gaussian 
channels with feedback. For such channels, the fact that real- 
time linear schemes are rate-distortion achieving has been ob- 
served in 0, 11281 : and in a control theoretic context. 
Aside from such results (which involve matching between 
rate-distortion achieving test channels and capacity achieving 
source distributions 11281 ). capacity is known not to be a good 
measure of information reliability for channels for real-time 
(zero-delay or delay sensitive) control and estimation prob- 
lems, see ll9ll and [78]. Such general aspects of comparison 
of channels for cost minimization have been investigated in 
[8] among others. 



Also in the information theory literature, performance of 
information transmission schemes for channels with feedback 
has been a recurring avenue of research in information theory, 
for both variable length and fixed length coding schemes BTI . 
[14], [76 1, [22]. In such setups, the source comes from a fixed 
alphabet, except the sequential setup in [76] and El . 

B. Contributions of the Paper 

In view of the discussion above, the paper makes the fol- 
lowing contributions. The question: When does a linear sys- 
tem driven by unbounded noise, controlled over a channel 
(possibly with memory) satisfy Birkhoff's sample path ergodic 
theorem (or is asymptotically mean stationary)?, has not been 
answered to our knowledge. Also, the finite moment con- 
ditions for an arbitrary discrete memoryless channel for a 
system driven by unbounded noise have not been investigated 
to our knowledge, except for the bounded noise analysis in 
[78]. The contributions of the paper are on the two problems
stated above. In this paper, we will show that the results
in the literature can be strengthened to asymptotic mean stationarity
and ergodicity. As a consequence of Kac's Lemma
[19], stability in probability can also be established. We will
also consider conditions for finite second moment stability.
We will use the random-time state-dependent drift approach
[94] to prove our achievability results. Hence, we will find
classes of channels under which the controlled process $\{x_t\}$
is stochastically stable in each of the following senses:

• $\{x_t\}$ is recurrent: There exists a compact set $A$ such that
$\{x_t \in A\}$ infinitely often almost surely.

• $\{x_t\}$ is asymptotically mean stationary and satisfies Birkhoff's
sample path ergodic theorem. We will establish that Shannon
capacity provides a boundary point in the space of
channels on whether this objective can be realized or not,
provided a mild technical condition holds.

• $\lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} \|x_t\|^2$ exists and is finite almost surely.

II. Stochastic Stabilization over a DMC 

A. Asymptotic Mean Stationarity and n-ergodicity

Theorem 2.1: For a controlled linear source given in (1)
over a DMC under any admissible coding and controller policy,
to satisfy the AMS property under the following condition

$$\liminf_{T \to \infty} \frac{1}{T} h(x_T) \leq 0,$$

the channel capacity $C$ must satisfy

$$C \geq \log_2(|a|).$$

□

Proof: See the proof of Theorem 4.1 in Section V-A.

Remark 2.1: The condition $\liminf_{T \to \infty} \frac{1}{T} h(x_T) \leq 0$ is a
very weak condition. For example, a stochastic process whose
second moment grows subexponentially in time such that

$$\liminf_{T \to \infty} \frac{\log(E[x_T^2])}{T} = 0$$

satisfies this condition.
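The difference between subexponential and exponential growth in Remark 2.1 can be seen numerically. This is a minimal sketch with assumed second-moment profiles ($T^3$ and $1.5^{2T}$); the function names are hypothetical and the two profiles are not taken from the paper.

```python
import math


def normalized_log_moment(second_moment, horizon):
    """Return log(E[x_T^2]) / T for a given second-moment profile and horizon T."""
    return math.log(second_moment(horizon)) / horizon


def polynomial_growth(t):
    # Assumed subexponential profile: E[x_t^2] = t^3.
    return float(t) ** 3


def exponential_growth(t):
    # Assumed exponential profile: E[x_t^2] = 1.5^(2t).
    return 1.5 ** (2 * t)


slow = normalized_log_moment(polynomial_growth, 10 ** 6)
fast = normalized_log_moment(exponential_growth, 100)
print(slow, fast)
```

The polynomial profile gives a normalized value that shrinks toward zero as the horizon grows, while the exponential profile stays at $2\log(1.5)$ for every horizon.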
The above condition is almost sufficient as well, as we state
in the following.

Theorem 2.2: For the existence of a compact coordinate
recurrent set (see Definition 7.4), the following is sufficient:
The channel capacity $C$ satisfies $C > \log_2(|a|)$. □

Proof: See Section V-C. □
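As a concrete illustration of the capacity condition $C > \log_2(|a|)$, the sketch below uses a binary symmetric channel, whose capacity is $1 - h_b(\epsilon)$ bits per use. The choice of channel, the function names, and the numerical values are assumptions made for illustration.

```python
import math


def binary_entropy(p):
    """Binary entropy function in bits, with the convention 0 * log(0) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bsc_capacity(eps):
    """Capacity in bits per channel use of a BSC with crossover probability eps."""
    return 1.0 - binary_entropy(eps)


def capacity_exceeds_threshold(capacity_bits, a):
    """Check the sufficient condition C > log2(|a|) for a scalar system."""
    return capacity_bits > math.log2(abs(a))


c = bsc_capacity(0.05)
print(c, capacity_exceeds_threshold(c, 1.5), capacity_exceeds_threshold(c, 2.0))
```

With crossover probability 0.05 the capacity is roughly 0.71 bits, which exceeds $\log_2(1.5)$ but not $\log_2(2) = 1$, so only the first system meets the condition over this channel.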



For the proof, we consider the following update algorithm. 
The algorithm and its variations have been used in source 
coding and networked control literature: See for example the 
earlier papers [29|, |48|, and more recent ones lfl3l . Il66ll . [60|, 
ISO, 120, H3 and ll98l . Our contribution is primarily on the 
stability analysis. 

Let $n$ be a given block length. Consider the following setup.
We will consider a class of uniform quantizers, defined by two
parameters, with bin size $\Delta > 0$, and an even number $K(n) \geq 2$
(see Figure 2). The uniform quantizer map is defined as
follows: For $k = 1, 2, \dots, K(n)$,

$$Q_{K(n)}^{\Delta}(x) = \begin{cases} \big(k - \tfrac{1}{2}(K(n) + 1)\big)\Delta, & \text{if } x \in \big[(k - 1 - \tfrac{1}{2}K(n))\Delta,\ (k - \tfrac{1}{2}K(n))\Delta\big), \\ \big(\tfrac{1}{2}(K(n) - 1)\big)\Delta, & \text{if } x = \tfrac{1}{2}K(n)\Delta, \\ Z, & \text{if } |x| > \tfrac{1}{2}K(n)\Delta, \end{cases}$$

where $Z$ denotes the overflow symbol in the quantizer. We
define $\{x : Q_{K(n)}^{\Delta}(x) \neq Z\}$ to be the granular region of the
quantizer.
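The quantizer map can be sketched in a few lines. The code below assumes the definition above (bin midpoints in the granular region, a single overflow symbol outside it); the name `uniform_quantize` and the string `"Z"` used for the overflow symbol are illustrative choices.

```python
import math

OVERFLOW = "Z"  # stand-in for the overflow symbol Z


def uniform_quantize(x, bin_size, num_bins):
    """Uniform quantizer with num_bins granular bins of width bin_size.

    Returns the midpoint of the bin containing x, or OVERFLOW when
    |x| exceeds half the granular range (num_bins * bin_size / 2).
    """
    half_range = 0.5 * num_bins * bin_size
    if abs(x) > half_range:
        return OVERFLOW
    if x == half_range:
        # The right end point is assigned to the last granular bin.
        return 0.5 * (num_bins - 1) * bin_size
    # Bin index k in {1, ..., num_bins}; the clamp guards against rounding.
    k = min(math.floor(x / bin_size + 0.5 * num_bins) + 1, num_bins)
    return (k - 0.5 * (num_bins + 1)) * bin_size


print(uniform_quantize(0.3, 1.0, 4), uniform_quantize(2.1, 1.0, 4))
```

With four bins of unit width the granular region is $[-2, 2]$; the value 0.3 falls in the bin $[0, 1)$ and maps to its midpoint, while 2.1 is reported as overflow.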



[Figure 2 image not recoverable; surviving labels: bin size, overflow bin.]

Fig. 2: A uniform quantizer with a single overflow bin.

At every sampling instant $t = kn$, $k = 0, 1, 2, \dots$, the
source coder $\mathcal{E}^s_t$ maps the quantizer output symbols in $\mathbb{R} \cup \{Z\}$ to a
set $\mathcal{M}(n) = \{1, 2, \dots, K(n) + 1\}$. A channel encoder $\mathcal{E}^c_t$
maps the elements in $\mathcal{M}(n)$ to corresponding channel inputs
$q_{[kn,(k+1)n-1]} \in \mathcal{M}^n$.

For each time $t = kn - 1$, $k = 1, 2, 3, \dots$, the channel
decoder applies a mapping $\mathcal{D}_{tn} : \mathcal{M}'^n \to \mathcal{M}(n)$, such that

$$c'_{(k+1)n-1} = \mathcal{D}_{kn}\big(q'_{[kn,(k+1)n-1]}\big).$$

Finally, the controller runs an estimator:

$$\hat{x}_{kn} = (\mathcal{E}^s_{kn})^{-1}\big(c'_{(k+1)n-1}\big) \times 1_{\{c'_{(k+1)n-1} \neq Z\}} + 0 \times 1_{\{c'_{(k+1)n-1} = Z\}},$$

where $1_E$ denotes the indicator function for event $E$. Hence,
when the decoder output is the overflow symbol, the estimation
output is 0.

We consider quantizers that are adaptive. In our setup, the
bin size of the uniform quantizer acts as the state of the
quantizer. At time $kn$ the bin size, $\Delta_{kn}$, is assumed to be a
function of the previous state $\Delta_{(k-1)n}$ and the past $n$ channel
outputs. We assume that the encoder has access to the previous
channel outputs. Thus, such a quantizer is implementable at
both the encoder and the decoder.

With $K(n) > \lceil |a|^n \rceil$, $R = \log_2(K(n) + 1)$, let us define
$R'(n) = \log_2(K(n))$ and let

$$R'(n) > n \log_2\Big(\frac{|a|}{\alpha}\Big),$$

for some $\alpha$, $0 < \alpha < 1$ and $\delta > 0$. When clear from the
context, we will drop the index $n$ in $R'(n)$.

We will consider the following update rules in the controller
actions and the quantizers. For $t \geq 0$ and with $\Delta_0 > L$ for
some $L \in \mathbb{R}_+$, and $x_0 \in \mathbb{R}$, consider: For $t = kn$, $k \in \mathbb{N}$,

$$u_t = -1_{\{t = (k+1)n-1\}} \frac{a^n}{b} \hat{x}_{kn},$$

$$\Delta_{(k+1)n} = \Delta_{kn}\, Q\big(\Delta_{kn}, c'_{(k+1)n-1}\big), \tag{5}$$

where $c'$ denotes the decoder output variable. If we use $\delta > 0$
and $L > 0$ such that

$$Q(\Delta, c') = \begin{cases} (|a| + \delta)^n, & \text{if } c' = Z, \\ \alpha^n, & \text{if } c' \neq Z,\ \Delta \geq L, \\ 1, & \text{if } c' \neq Z,\ \Delta < L, \end{cases} \tag{6}$$

we will show that a recurrent set exists. The above imply that
$\Delta_t \geq L\alpha^n =: L'$ for all $t \geq 0$.

Thus, we have three main events: When the decoder output
is the overflow symbol, the quantizer is zoomed out (with
a coefficient of $(|a| + \delta)^n$). When the decoder output is not
the overflow symbol $Z$, the quantizer is zoomed in (with a
coefficient of $\alpha^n$) if the current bin size is greater than or
equal to $L$, and otherwise the bin size does not change.
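The three zoom events can be written as a small update function. This is a sketch under the reading of (5) and (6) given above (zoom out by $(|a| + \delta)^n$ on overflow, zoom in by $\alpha^n$ when the bin size is at least $L$, no change otherwise); the function names and the numerical values are hypothetical.

```python
def zoom_factor(bin_size, decoder_output, a, delta, alpha, n, L, overflow="Z"):
    """Multiplicative update applied to the bin size at a sampling instant."""
    if decoder_output == overflow:
        return (abs(a) + delta) ** n  # zoom out after an overflow
    if bin_size >= L:
        return alpha ** n  # zoom in while the bin size is at least L
    return 1.0  # below L the bin size is left unchanged


def update_bin_size(bin_size, decoder_output, a, delta, alpha, n, L):
    """Next bin size: the current bin size times the zoom factor."""
    return bin_size * zoom_factor(bin_size, decoder_output, a, delta, alpha, n, L)


# Illustrative parameters (assumed values): a = 2, delta = 0.5, alpha = 0.5, n = 2, L = 1.
params = dict(a=2.0, delta=0.5, alpha=0.5, n=2, L=1.0)
print(update_bin_size(1.0, "Z", **params))  # overflow: zoom out
print(update_bin_size(4.0, 3, **params))    # granular symbol, bin size >= L: zoom in
print(update_bin_size(0.5, 3, **params))    # granular symbol, bin size < L: unchanged
```

Starting from a bin size of at least $L$, repeated zoom-in steps stop once the bin size drops below $L$, so the bin size never falls below $L\alpha^n$.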

We will establish our stability result through the random-time
stochastic drift criterion of Theorem 7.2, developed in [94]
and [96]. This is because the quantizer helps
reduce the uncertainty on the system state only when the state
is in the granular region of the quantizer. The times when the
state is in this region are random. The reader is referred to
Section VII-B in the appendix for a detailed discussion on the
drift criteria.

In the following, we make the quantizer bin size process
space countable and as a result establish the irreducibility of
the sampled process $(x_{tn}, \Delta_{tn})$.

Theorem 2.3: For an adaptive quantizer satisfying Theorem 2.2,
suppose that the quantizer bin sizes are such that
their logarithms are integer multiples of some scalar $s$, and
$\log_2(Q(\cdot))$ takes values in integer multiples of $s$. Suppose
the integers taken are relatively prime (that is, they share no
common divisors except for 1). Then the sampled process
$(x_{tn}, \Delta_{tn})$ forms a positive Harris recurrent Markov chain at
sampling times on the space of admissible quantizer bins and
state values. □

Proof: See Section V-D. □

Theorem 2.4: Under the conditions of Theorems 2.2 and
2.3, the process $\{x_t, \Delta_t\}$ is $n$-stationary, $n$-ergodic and hence
AMS. □

Proof: See Section V-E. □

The proof follows from the observation that a positive Harris
recurrent Markov chain is ergodic. It uses the property that if
a sampled process is a positive Harris recurrent Markov chain,
and if the intersampling time is fixed, with a time-homogeneous
update in the inter-sampling times, then the process is mixing,
$n$-ergodic and $n$-stationary.

B. Finite Second Moment 

In this section, we discuss finite moment stability. Such 
an objective is important in applications. In control theory, 
quadratic cost functions are the most popular ones for lin- 
ear and Gaussian systems. Furthermore, the two-part coding 
scheme of Berger in [5] can be generalized for more general 
unstable systems if one can prove finite moment boundedness 
of sampled end-points. 

For a given coding scheme with block-length $n$ and a message
set $\mathcal{M}(n) = \{1, 2, \dots, K(n) + 1\}$, and a decoding function
$\gamma : \mathcal{M}'^n \to \{1, 2, \dots, K(n) + 1\}$, define three types of
errors:

• Type I-A: Error from a granular symbol to another granular
symbol. We define a bound for such errors. Define
$P^e_{g|g}(n)$ to be

$$\max_{c \in \mathcal{M}(n) \setminus \{Z\}} P\big(\gamma(q'_{[0,n-1]}) \neq c,\ \gamma(q'_{[0,n-1]}) \neq Z \mid c\big),$$

where conditioning on $c$ means that the symbol $c$ is transmitted.

• Type I-B: Error from a granular symbol to $Z$. We define
the following:

$$P^e_{Z|g}(n) := \max_{c \in \mathcal{M}(n) \setminus \{Z\}} P\big(\gamma(q'_{[0,n-1]}) = Z \mid c\big).$$

• Type II: Error from $Z$ to a granular symbol:

$$P^e_{g|Z}(n) := P\big(\gamma(q'_{[0,n-1]}) \neq Z \mid Z\big).$$

Type II error will be shown to be crucial in the analysis
of the error exponent. Type I-A and I-B will be important
for establishing the drift properties. We summarize our results
below.

Theorem 2.5: A sufficient condition for second moment
stability (for the joint $(x_t, \Delta_t)$ process) over a discrete memoryless
channel (DMC) is that:

lim (ilog(Pl| ff (n)) + 21og(|a|+5) < 0, 
lim (k- log(f"U(n)) + 21og(|a| + S) < 0, 
lim (k- log(P e , (n)) + 21og(|a| + S) + 2 K log(a) < 0, 

n—toc 77, y|y 

R'{n) > nlog 2 (|a|/a) 

with arbitrarily small, positive n > and 

1 

K < g . 

logw+s(™^) 

□ 

Proof: See Section V. □



Let us define

$$P_e(n) := \max_{c \in \mathcal{M}(n)} P\big(\gamma(q'_{[0,n-1]}) \neq c \mid c \text{ is transmitted}\big).$$

When the block-length is clear from the context, we drop the
index $n$. We have the following corollary to Theorem 2.5.

Corollary 2.1: A sufficient condition for second moment
stability (for the joint $(x_t, \Delta_t)$ process) over a discrete memoryless
channel (DMC) is that

$$\lim_{n \to \infty} \Big(\frac{1}{n} \log(P_e(n)) + 2 \log(|a| + \delta)\Big) < 0,$$

with rate $R'(n) > n \log_2\big(\frac{|a|}{\alpha}\big)$. □

Remark 2.2: For a DMC with block length $n$, Shannon's
random coding [27] satisfies:

$$P_e(n) \leq e^{-nE(R) + o(n)},$$

uniformly for all codewords $c \in \mathcal{M}(n)$, with $c'$
being the decoder output (thus, the random coding exponent also
applies uniformly over the set). Here $\frac{o(n)}{n} \to 0$ as $n \to \infty$
and $E(R) > 0$ for $0 < R < C$. Thus, under the above
conditions, the exponent under random coding should satisfy
$E(R) > 2 \log_2(|a| + \delta)$.

Remark 2.3: The error exponent is typically
improved with feedback, unlike the capacity of DMCs. However,
a precise solution to the error exponent problem of fixed-length
block coding with noiseless feedback is not known. Some partial
results have been reported in [21] (in particular, the sphere
packing bound is optimal for a class of symmetric channels for
rates above a critical number even with feedback), Chapter 10
of d, H, El, (23], HI, HO) and EH. Particularly related 
to this section, [10] has considered the exponent maximization
for a special message symbol, at rates close to capacity. At the
end of the paper, variable-length coding, in the
context of Burnashev's [14] setup, will be discussed along with
some other contributions in the literature. In case feedback
is not used, the Gilbert exponent [69] for low-rate regions may
provide better bounds than the random coding exponent. □

1) Zero-Error Transmission for Z: An important practical
setup would be the case when $Z$ is transmitted with no error
and is not confused with the granular messages.
We state this as follows.

Assumption A0: We have that $P^e_{Z|g}(n) = P^e_{g|Z}(n) = 0$ for
$n \geq n_0$ for some $n_0 \in \mathbb{N}$. □

Theorem 2.6: Under Assumption A0, a sufficient condition
for second moment stability is

$$\lim_{n \to \infty} P_e(n) (|a| + \delta)^{2n} < 1,$$

with rate $R'(n) > n \log_2\big(\frac{|a|}{\alpha}\big)$ and $\kappa > 1/2$. □

Proof: See Section V. □

Remark 2.4: The result under (A0) is related to the notion
of any-time capacity proposed by Sahai and Mitter [78]. We
note that Sahai and Mitter also considered a block-coding
setup, for the case when the noise is bounded, and were able
to obtain a rate/reliability criterion similar to the one above. It is worth
emphasizing that error-free transmission of the single symbol $Z$ in
the under-zoom phase drastically relaxes the reliability
requirements. □



III. Channels with Memory 

Definition 3.1: Let Class A be the set of channels which 
satisfy the following two properties: 
a) The following Markov chain condition holds: 



q't <-> Qt,q[o,t-i],q{o,t-i] <-* •'' 



[0,4]) 



for all t > 0. 

b) The channel capacity with feedback is given by:

$$C = \lim_{T \to \infty} \max_{\{P(q_t | q_{[0,t-1]}, q'_{[0,t-1]})\}} \frac{1}{T} I(q_{[0,T-1]} \to q'_{[0,T-1]}), \tag{7}$$

where $0 \le t \le T-1$ and the directed mutual information is
defined by

$$I(q_{[0,T-1]} \to q'_{[0,T-1]}) := \sum_{t=1}^{T-1} I(q_{[0,t]}; q'_t | q'_{[0,t-1]}) + I(q_0; q'_0).$$

Discrete memoryless channels (DMCs) naturally belong to
this class of channels. For such channels, it is known that
feedback does not increase the capacity. Such a class also
includes finite state stationary Markov channels which are
indecomposable [72], and non-Markov channels which satisfy
certain symmetry properties [82]. Further examples are studied
in El and in [|20l . 

Theorem 3.1: For a linear system controlled over a noisy
channel with memory with feedback in Class A, if the channel
capacity is less than $\log_2(|a|)$, then the AMS property under
the following condition

$$\liminf_{T \to \infty} \frac{1}{T} h(x_T) \le 0$$

cannot be satisfied under any policy. □

Proof: See the proof of Theorem 4.1 in Section V-A. □

If the channel is not information stable, then information spectrum
methods lead to pessimistic realizations of capacity (known
as the lim inf in probability of the normalized information
density, see [90], [87]). We do not consider such channels in
this paper, although the proof is generalizable to some cases
when the channel state is Markov and the worst case initial
input state is considered as in [72].

1 One can also obtain a positive result: If the channel capacity is greater than
$\log_2(|a|)$, then there exists a coding scheme leading to an AMS state process
provided that the channel restarts itself with the sending of a new block. If this
assumption does not hold, then, using the proofs in the paper, we can prove
coordinate-recurrence under this condition. For the AMS property, however,
new tools will be required. Our proof would have to be modified to account
for the non-Markovian nature of the sampled state and quantizer process.



IV. Higher Order Sources

The proposed technique is also applicable to a class of
settings for the multi-dimensional setup. Observe that a higher
order ARMA model of the form (1) can be written in the
following form:

$$x_{t+1} = A x_t + B u_t + G d_t, \tag{8}$$

where $x_t \in \mathbb{R}^N$ is the state at time $t$, $u_t$ is the control input,
and $\{d_t\}$ is a sequence of zero-mean independent, identically
distributed (i.i.d.) Gaussian random vectors of appropriate
dimensions. Here $A$ is the system matrix with at
least one eigenvalue greater than 1 in magnitude, that is, the
system is open-loop unstable. Furthermore, $(A, B)$ and $(A, G)$
are controllable pairs, that is, the state process can be traced
in finite time from any point in $\mathbb{R}^N$ to any other point in at
most $N$ time stages, by either the controller or the Gaussian
noise process.
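To make the state-space rewriting concrete, the following minimal sketch builds a companion-form triple $(A, B, G)$ for a purely autoregressive scalar recursion $x_{t+1} = \sum_i c_i x_{t-i} + b u_t + d_t$ and steps it forward. Restricting to the autoregressive part (no moving-average noise terms), the list-based matrices, and the names `companion_form` and `step` are assumptions of this example, not constructions from the paper.

```python
import random

def companion_form(ar_coeffs, b):
    """Return (A, B, G) for x_{t+1} = sum_i ar_coeffs[i] * x_{t-i} + b*u_t + d_t.

    The state vector is [x_t, x_{t-1}, ..., x_{t-N+1}] with N = len(ar_coeffs).
    """
    n = len(ar_coeffs)
    # First row carries the recursion; the remaining rows shift the history.
    A = [list(ar_coeffs)] + [[1.0 if j == i else 0.0 for j in range(n)]
                             for i in range(n - 1)]
    B = [b] + [0.0] * (n - 1)
    G = [1.0] + [0.0] * (n - 1)
    return A, B, G

def step(A, B, G, x, u, d):
    """One step of x_{t+1} = A x_t + B u_t + G d_t for scalar u and d."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) + B[i] * u + G[i] * d
            for i in range(len(x))]

if __name__ == "__main__":
    A, B, G = companion_form([1.5, -0.4], b=1.0)
    rng = random.Random(0)
    state = [0.3, -0.2]
    for _ in range(5):
        state = step(A, B, G, state, u=0.0, d=rng.gauss(0.0, 1.0))
    print(state)
```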

In the following we assume that all modes with eigenvalues
$\{\lambda_i, 1 \le i \le N\}$ of $A$ are unstable, that is, have magnitudes
greater than or equal to 1. There is no loss of generality here since if
some eigenvalues are stable, by a similarity transformation, the
unstable modes can be decoupled from stable modes; stable
modes are already recurrent.

Theorem 4.1: Consider a multi-dimensional linear system
with unstable eigenvalues, that is, $|\lambda_i| \ge 1$ for $i = 1, \dots, N$.
For such a system controlled over a noisy channel with memory
with feedback in Class A, if the channel capacity satisfies

$$C < \sum_i \log_2(|\lambda_i|),$$

there does not exist a stabilizing coding and control scheme
with the property $\liminf_{T \to \infty} \frac{1}{T} h(x_T) \le 0$. □

Proof: See Section V-A. □

For sufficiency, we will assume that $A$ is a diagonalizable
matrix with real eigenvalues (a sufficient condition being that
the poles of the system are distinct and real). In this case, the
analysis follows from the discussion for scalar systems, since the
recurrence analysis for the scalar case is applicable
to each of the subsystems along each of the eigenvectors. The
possibly correlated noise components lead to the same
recurrence analysis discussed earlier. We thus have the following
result.

Theorem 4.2: Consider a multi-dimensional system with
a diagonalizable matrix $A$. If the Shannon capacity of the
(DMC) channel used in the controlled system satisfies

$$C > \sum_{|\lambda_i| > 1} \log_2(|\lambda_i|),$$

there exists a stabilizing (in the AMS sense) scheme. □

Proof: See Section N-H\ □ 
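The rate thresholds in Theorems 4.1 and 4.2 are simple to evaluate once the eigenvalues are known. The sketch below sums $\log_2|\lambda_i|$ over the eigenvalues with magnitude above 1 and compares the result against a channel capacity in bits per channel use; the binary symmetric channel is used purely as an example channel, and the function names are hypothetical.

```python
import math

def required_rate_bits(eigenvalues):
    """Sum of log2|lambda_i| over eigenvalues with |lambda_i| > 1."""
    return sum(math.log2(abs(lam)) for lam in eigenvalues if abs(lam) > 1)

def bsc_capacity_bits(p):
    """Capacity (bits per use) of a binary symmetric channel with crossover p."""
    if p in (0.0, 1.0):
        return 1.0
    return 1.0 + p * math.log2(p) + (1 - p) * math.log2(1 - p)

def capacity_exceeds_requirement(eigenvalues, capacity_bits):
    """True when the capacity strictly exceeds the eigenvalue-based threshold."""
    return capacity_bits > required_rate_bits(eigenvalues)

if __name__ == "__main__":
    print(required_rate_bits([2.0, 0.5, -4.0]))
    print(capacity_exceeds_requirement([1.2], bsc_capacity_bits(0.1)))
```

Note that eigenvalues on or inside the unit circle contribute nothing to the sum, so the threshold in Theorem 4.2 coincides with the one in Theorem 4.1 when all modes are unstable.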

The result can be extended to a case where the matrix $A$
is in a Jordan form. Such an extension entails considerable
details in the analysis for the stopping time computations and
has not been included in the paper. A discussion for the special
case of discrete noiseless channels is contained in [46] in the
context of decentralized linear systems.

V. Proofs

A. Proof of Theorem 4.1

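We present the proof for a multi-dimensional system since
this case is more general. For channels under Class A (which
includes the discrete memoryless channels (DMC) as a special
case), the capacity is given by (7).

Let us define

$$R_T := \max_{\{P(q_t | q_{[0,t-1]}, q'_{[0,t-1]}),\ 0 \le t \le T-1\}} \frac{1}{T} \sum_{t=0}^{T-1} I(q'_t; q_{[0,t]} | q'_{[0,t-1]}).$$

Observe that for $t > 0$:

$$\begin{aligned}
I(q'_t; q_{[0,t]} | q'_{[0,t-1]})
&= H(q'_t | q'_{[0,t-1]}) - H(q'_t | q_{[0,t]}, q'_{[0,t-1]}) \\
&= H(q'_t | q'_{[0,t-1]}) - H(q'_t | q_{[0,t]}, x_t, q'_{[0,t-1]}) \qquad (9) \\
&\ge H(q'_t | q'_{[0,t-1]}) - H(q'_t | x_t, q'_{[0,t-1]}) \\
&= I(x_t; q'_t | q'_{[0,t-1]}). \qquad (10)
\end{aligned}$$

Here, (9) follows from the assumption that the channel is of
Class A. Since for two sequences with $a_n \ge b_n$ we have
$\limsup_n a_n \ge \limsup_n b_n$, and $R_T$ is assumed to
have a limit, it follows that (with $|A|$ denoting the absolute value of the determinant of $A$):

$$\begin{aligned}
\lim_{T \to \infty} R_T
&\ge \limsup_{T \to \infty} \frac{1}{T} \Big( \sum_{t=1}^{T-1} I(x_t; q'_t | q'_{[0,t-1]}) + I(x_0; q'_0) \Big) \\
&= \limsup_{T \to \infty} \frac{1}{T} \Big( \sum_{t=1}^{T-1} \big( h(x_t | q'_{[0,t-1]}) - h(x_t | q'_{[0,t]}) \big) + I(x_0; q'_0) \Big) \\
&= \limsup_{T \to \infty} \frac{1}{T} \Big( \sum_{t=1}^{T-1} \big( h(A x_{t-1} + G d_{t-1} + B u_{t-1} | q'_{[0,t-1]}) - h(x_t | q'_{[0,t]}) \big) + I(x_0; q'_0) \Big) \\
&= \limsup_{T \to \infty} \frac{1}{T} \Big( \sum_{t=1}^{T-1} \big( h(A x_{t-1} + G d_{t-1} | q'_{[0,t-1]}) - h(x_t | q'_{[0,t]}) \big) + I(x_0; q'_0) \Big) \qquad (11) \\
&\ge \limsup_{T \to \infty} \frac{1}{T} \Big( \sum_{t=1}^{T-1} \big( h(A x_{t-1} + G d_{t-1} | q'_{[0,t-1]}, d_{t-1}) - h(x_t | q'_{[0,t]}) \big) + I(x_0; q'_0) \Big) \qquad (12) \\
&= \limsup_{T \to \infty} \frac{1}{T} \Big( \sum_{t=1}^{T-1} \big( h(A x_{t-1} | q'_{[0,t-1]}, d_{t-1}) - h(x_t | q'_{[0,t]}) \big) + I(x_0; q'_0) \Big) \\
&= \limsup_{T \to \infty} \frac{1}{T} \Big( \sum_{t=1}^{T-1} \big( h(A x_{t-1} | q'_{[0,t-1]}) - h(x_t | q'_{[0,t]}) \big) + I(x_0; q'_0) \Big) \qquad (13) \\
&= \limsup_{T \to \infty} \frac{1}{T} \Big( \sum_{t=1}^{T-1} \big( \log_2(|A|) + h(x_{t-1} | q'_{[0,t-1]}) - h(x_t | q'_{[0,t]}) \big) + I(x_0; q'_0) \Big) \\
&= \limsup_{T \to \infty} \frac{1}{T} \Big( (T-1)\log_2(|A|) + h(x_0 | q'_0) - h(x_{T-1} | q'_{[0,T-1]}) + I(x_0; q'_0) \Big) \\
&= \log_2(|A|) - \liminf_{T \to \infty} \frac{1}{T} h(x_{T-1} | q'_{[0,T-1]}) \\
&\ge \log_2(|A|) - \liminf_{T \to \infty} \frac{1}{T} h(x_{T-1}) \qquad (14) \\
&\ge \log_2(|A|).
\end{aligned}$$

Equation (11) follows from the fact that the control action
is a function of the past channel outputs, (12) follows from
the fact that conditioning does not increase entropy, and (13)
from the observation that $\{d_t\}$ is an independent process.
(14) follows from conditioning. The other equations follow
from the properties of mutual information. By the hypothesis
$\liminf_{T \to \infty} \frac{1}{T} h(x_T) \le 0$, it must be that $\lim_{T \to \infty} R_T \ge \log_2(|A|)$.
Thus, the capacity also needs to satisfy this bound.

□

B. Stopping Time Analysis

This section presents an important supporting result on stopping
time distributions, which is key in the application of
Theorem 7.2 for the stochastic stability results. We begin with
the following.

Lemma 5.1: Let $\mathcal{B}(\mathbb{R} \times \mathbb{R}_+)$ denote the Borel $\sigma$-field on
$\mathbb{R} \times \mathbb{R}_+$. It follows that

$$P\big( (x_{kn}, \Delta_{kn}) \in (C \times D) \,\big|\, \{(x_{sn}, \Delta_{sn}), s < k\} \big) = P\big( (x_{kn}, \Delta_{kn}) \in (C \times D) \,\big|\, (x_{(k-1)n}, \Delta_{(k-1)n}) \big),$$

$\forall (C \times D) \in \mathcal{B}(\mathbb{R} \times \mathbb{R}_+)$, i.e., $(x_{tn}, \Delta_{tn})$ is a Markov chain.

□

The above follows from the observations that the channel
is memoryless, the encoding only depends on the most recent
samples of the state and the quantizer, and the control policies
use the channel outputs received in the last block, which
stochastically only depend on the parameters in the previous
block.

Let us define $h_t := \frac{x_t}{2^{R'-1}\Delta_t}$. We will say that the quantizer
is perfectly zoomed when $|h_t| \le 1$, and under-zoomed
otherwise.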
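The zoom indicator just defined is easy to evaluate along a sampled trajectory. The following minimal sketch computes $h = x / (2^{R'-1}\Delta)$ and lists the block-end times $kn$ at which the quantizer is perfectly zoomed; the inputs (sampled states and bin sizes at block ends) and all function names are hypothetical and serve only to illustrate the definition.

```python
def zoom_ratio(x, delta, rate_bits):
    """h = x / (2**(R' - 1) * delta), with rate_bits playing the role of R'."""
    return x / (2 ** (rate_bits - 1) * delta)

def is_perfectly_zoomed(x, delta, rate_bits):
    """Perfect zoom means |h| <= 1; otherwise the quantizer is under-zoomed."""
    return abs(zoom_ratio(x, delta, rate_bits)) <= 1

def perfectly_zoomed_block_times(states, bin_sizes, rate_bits, n):
    """states[k] and bin_sizes[k] stand for x_{kn} and Delta_{kn}.

    Returns the times kn at which the quantizer is perfectly zoomed.
    """
    return [k * n for k, (x, d) in enumerate(zip(states, bin_sizes))
            if is_perfectly_zoomed(x, d, rate_bits)]

if __name__ == "__main__":
    print(perfectly_zoomed_block_times([1.0, 9.0, 30.0, 7.9],
                                       [1.0, 2.0, 4.0, 2.0],
                                       rate_bits=3, n=5))
```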

Define a sequence of stopping times for the perfect-zoom
case (where the initial state is perfectly zoomed at $\tau_0$) as

$$\tau_0 = 0, \quad \tau_{z+1} = \inf\{kn > \tau_z : |h_{kn}| \le 1\}, \quad z, k \in \mathbb{Z}_+. \tag{15}$$

As discussed in Section II-B, there will be three types of
errors.

• Type I-A: Error from a granular symbol to another granular
symbol. In this case the quantizer will zoom in, yet
an incorrect control action will be applied. As in Section
II-B, $P^e_{g|g}(n)$ is an upper bound for such an error.

• Type I-B: Error from a granular symbol to $Z$. In this
case, no control action will be applied and the quantizer
will be zoomed out. As in Section II-B, $P^e_{Z|g}(n)$ is an
upper bound for such an error.

• Type II: Error from $Z$ to a granular symbol. At consecutive
time stages, until the next stopping time, the quantizer
should ideally zoom out. Hence, this error may take place
in subsequent time stages (since at time 0 the quantizer
is zoomed, this does not take place). The consequence of
such an error is that the quantizer will be zoomed in and
an incorrect control action will be applied. Let

$$P_e(n) := P^e_{g|Z}(n) = P\big(\gamma(q'_{[0,n-1]}) \ne Z \,\big|\, Z \text{ is transmitted}\big).$$

We will drop the dependency on $n$ when the block length is
clear from the context.
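The three error types can be summarized as a small classification rule on the pair (transmitted symbol, decoded symbol). The sketch below is only an illustration of that taxonomy; the label `Z` for the under-zoom symbol, the integer labels for granular bins, and the function name are assumptions of this example.

```python
Z = "Z"  # hypothetical label for the under-zoom symbol

def classify_error(sent, decoded):
    """Classify a block transmission using the error types listed above."""
    if sent == decoded:
        return "none"
    if sent == Z:
        return "II"    # Z decoded as a granular symbol
    if decoded == Z:
        return "I-B"   # granular symbol decoded as Z
    return "I-A"       # granular symbol decoded as another granular symbol

if __name__ == "__main__":
    for pair in [(3, 3), (3, 5), (3, Z), (Z, 2), (Z, Z)]:
        print(pair, classify_error(*pair))
```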

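Lemma 5.2: The discrete probability distribution $P(\tau_{z+1} - \tau_z | x_{\tau_z}, \Delta_{\tau_z})$
is asymptotically, in the limit of large $\Delta_{\tau_z}$, dominated
(majorized) by a geometrically distributed measure. That
is, for $k \ge \lceil 1/\kappa \rceil + 1$,

$$\begin{aligned}
P(\tau_{z+1} - \tau_z \ge kn | x_{\tau_z}, \Delta_{\tau_z})
\le \Xi(\Delta_{\tau_z}) \Big( & (1 - P^e_{g|g} - P^e_{Z|g}) \big(e P_e^{\kappa}\big)^{k-2}
+ P^e_{g|g} \big(e P_e^{(\kappa - \frac{1-\kappa}{k-2})}\big)^{k-2} \\
& + P^e_{Z|g} \big(e P_e^{(\kappa + \frac{\kappa}{k-2})}\big)^{k-2} \Big), \qquad (16)
\end{aligned}$$

where $\Xi(\Delta_{\tau_z}) < \infty$ and $\Xi(\Delta_{\tau_z}) \to 1$ as $\Delta_{\tau_z} \to \infty$ for every
fixed $n$, uniformly in $|h_0| \le 1$ and

$$\kappa < \frac{1}{\log_{\frac{|a|+\delta}{|a|}}\big(\frac{|a|+\delta}{\alpha}\big)}. \tag{17}$$

□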



Proof:

Denote for $k \in \mathbb{N}$,

$$\Theta_k := P(\tau_{z+1} - \tau_z \ge kn | x_{\tau_z}, \Delta_{\tau_z}). \tag{18}$$

Without any loss, let $z = 0$ and $\tau_0 = 0$, so that $\Theta_k = P(\tau_1 \ge kn | x_0, \Delta_0)$.

Before proceeding with the proof, we highlight the technical
difficulties that will arise when the quantizer is in the under-zoom
phase. As elaborated on above, the errors at time 0 are
crucial for obtaining the error bounds: At time 0, with probability
at most $P^e_{g|g}(n)$, an error will take place so that the
quantizer will be zoomed in, yet an incorrect control signal
will be applied. With probability at most $P^e_{Z|g}(n)$, an error
will take place so that no control action is applied and the
quantizer is zoomed out. At consecutive time stages, until the
next stopping time, the quantizer should ideally zoom out, but
an error takes place with probability $P^e_{g|Z}(n)$ and leads the
quantizer to be zoomed in, and a control action to be applied.
Our analysis below will address all of these issues.

In the following we will assume that the probabilities are
conditioned on particular $x_0, \Delta_0$ values, to ease the notational
presentation.

We first consider the case where there is an intra-granular,
Type I-A, error at time 0, which takes place with probability
at most $P^e_{g|g}$ (this happens to be the worst case error for
the stopping time distribution). Now,

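$$\begin{aligned}
& P(\tau_1 \ge kn \mid \text{Type I-A error at time } 0) \\
&= P\Big( \bigcap_{m=1}^{k-1} (|h_{mn}| > 1) \,\Big|\, \text{Type I-A error at time } 0 \Big) \\
&= P\Big( \bigcap_{m=1}^{k-1} \Big( |x_{mn}| > 2^{R'-1} (|a| + \delta)^{(m - s_m - 1)n} \alpha^{(1+s_m)n} \Delta_0 \Big) \Big) \\
&= P\Big( \bigcap_{m=1}^{k-1} \Big( \Big| a^{mn} \Big( x_0 + \sum_{t=0}^{mn-1} a^{-t-1}(d_t + u_t) \Big) \Big| > 2^{R'-1} (|a| + \delta)^{(m - s_m - 1)n} \alpha^{(1+s_m)n} \Delta_0 \Big) \Big) \\
&= P\Big( \bigcap_{m=1}^{k-1} \Big( \Big| x_0 + \sum_{t=0}^{mn-1} a^{-t-1}(d_t + u_t) \Big| > 2^{R'-1} \frac{\alpha^n}{|a|^n} \Big(\frac{|a| + \delta}{|a|}\Big)^{(m-1)n} \Big(\frac{\alpha}{|a| + \delta}\Big)^{s_m n} \Delta_0 \Big) \Big). \qquad (19)
\end{aligned}$$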



In the above, $s_m$ is the number of errors in the transmissions
that have taken place until (but not including) time $m$, except
for the one at time 0. An error at time 0 would possibly lead to
a further enlargement of the bin size with non-zero probability,
whereas no error at time 0 leads to a strict decrease in the bin
size.

The study of the number of errors is crucial for analyzing
the stopping time distributions. In the following, we will condition
on the number of erroneous transmissions for $k$ successive
block codings for the under-zoomed phase. Suppose that for
$k > 1$, there are $s_k$ total erroneous transmissions in the time
stages $\{n, 2n, \dots, (k-1)n\}$ when the state is in fact under-zoomed,
but the controller interprets the received transmission
as a successful one. Thus, we take $s_1 = 0$.

Let $\zeta_1, \zeta_2, \dots, \zeta_{s_{k-1}}$ be the time stages when errors take
place, such that

$$\zeta_{t+1} = \min\big( \min(m > \zeta_t : c'_{nm} \ne c_{nm}),\ k - 1 \big), \quad \zeta_0 = 0,$$

such that $\zeta_{s_{k-1}+1} = k - 1$ or $\zeta_{s_{k-1}} = k - 1$, and define
$\eta_t = \zeta_{t+1} - \zeta_t$.

In the time interval $[\zeta_t n + 1, \zeta_{t+1} n - 1]$ the system is open-loop,
that is, there is no control signal applied, as there is
no misinterpretation by the controller. However, there will
be a non-zero control signal at times $\{\zeta_k n, k \ge 0\}$. These
are, however, upper bounded by the ranges of the quantizers
at the corresponding time stages. That is, when an erroneous
interpretation by the controller arises, the control ap- This follows from considering a conservative configuration



plied — (a n /6)ii(^ +1 )„_ 1 lives in the set: {a"(— 2 R 1 + k — 
(l/2))A fal l<fc< 2*'}. 

From ( [T9l >, we obtain d20b . Regarding ( f20b . let us now ob- 
serve the following: 



among an increasing subsequence of times . . . , Cm}, such 
that for all elements of this sequence: 



f) 6 



( x ° + X! a 1 ldt 



> 



a + 5\ 

( lgl±j )(C»-i-«»-i)n 



(sfc-i)n 



E« 



i=0 



(-Ci-l)r. 



*(6+l)n-l 



and for every element up until time to, (£ m — Q — (s T , 



8j) (log (!»!+<) (^ffi) > «a(s 

a| 

quence provides a conservative configuration which yet satis 
fies 



sA. Such an ordered se- 



> 2 



R'-l 



(\a\ + 5)(C»-«»-i)«Qt(i+»»)«Ao^ 



< P 



E ■ 



> 2 



|a| 

m— 1 



-i)n/ °_-j(i+s m )n^ o f or n j ar g e enough 
Id 



by considering if needed, to to be an element in the 
sequence with a lower index value for which this is satisfied. 
This has to hold at least for one time £ m , since ft — 1 satis- 
fies 124V Such a construction ensures that (|25|) is uniformly 
bounded from below for every k since Y^iLi(^\^')~ aAm < 1 



E 



HC»+l)n-l 



Hence, by l24l >. for some constant Bj, > 0, the following 
holds: 



Since the control signal u/^ 4+ x)n-i li ves m: {»"(— 2 fl _1 + 
ft — (l/2))A/£.) n , 1 < ft < 2" }, conditioned on having Sfc_i 
errors in the transmissions, the bound writes as ( l22l . where 
d = YliLo a ~ % ~ X di is a zero-mean Gaussian random variable 
with variance E \ d _ ^ a _ 2 . Here, (|2TI ) considers the worst case 



PMrf| > B b A Q [ (- 
< 2- 



) U ( 



(|a|+*)' 



\(»m+l)t 



P b A (H±£)C™ ( « )(-™ + D 



J±£n< 



) m "( ( | a ° + a)) 



/2</ 2 



(27) 



when even when the quantizer is zoomed, the controller in- 
correctly picks the worst case control signal and the chain 

rule for total probability: For two events A, B: P(A, B) < The above follows from bounding the complementary error 

mm(P(A), P(B)). The last inequality follows since P(\d\ > function by the following: J°° fi(dx) < J°° ^fi(dx), for q > 



when /i is a zero-mean Gaussian measure. In the above 

! ). The left hand side 



"b) > P(\ J2i=o 1 a 1 ld *\ > «b) for any a B G 

Now, let us consider to = ft - 1. In this case, (f23j follows, derivation a 1 ' 2 = E[d\] |a|~ 2 /(l - \a\ 
where in the last inequality we observe that |a; | < 2 R '~ 1 A Q , of (|27]i can be further upper bounded by, for any r > 0: 
since the state is zoomed at this time. 

In bounding the stopping time distribution, we will consider 
the condition that 



(ft - 1) - fa-! + l)(log (H+s) (P^-)) > a A (s k -i + 1) (24) 

W a 

for some arbitrarily small but positive a A , to be able to es- 
tablish that 

771—1 



M r (A )r 
1 - 1 



{Cm-(am+l)(log (|,| +J ) i 1 
a| 



'-))> aA (s m + l)} 



i _ y ( i£i+£)(c«-c»)«( " )(«-.»)n (1 _ 2 -fl') i > o (25) 



with M r (Ao) — > as Ao — > oo exponentially: 

M r (A ) 



and that ( J T^p) (a - 1_Sfc - l)n (|^j) (s *- l) " > 2 for sufficiently 
large n. Now, there exists an to such that 



->0, 



(28) 



(29) 



|a + <5| 



for any p € N+, due to the exponential dependence of 127] ) in 
A . Thus, combined with ( f24l . conditioned on having s^-i 
errors and a Type I-A error at time 0, we have the following 
bound on 



>c- 



\(Cfc-i-Sfc-i)rw 



(sfc-l)n 



and for this to, 

m— 1 



M r (A )r 



|a|+£,(fc-l- 



_ v Joj + 5 "(C»-U-(iog^ 1 (^))(^ 



" 1 {C,- 1 <^^} 



(30) 



with 



> 



a-E( j 



■)-""'") > 0. (26) 



logcio^M) (^£) + QA 



(31) 



P\ n (\ h rnn\ > l)|Type I-A error at time ) 

^m=l ' 

(fc— 1 / mti-1 «. «. 

f| (|a m "(xo+ E a- t - 1 (^+^))l > 2 Ji '- 1 (|a| + ^) (m ^ m_1) "a (1+Sm) "Ao)) 
m=l \ i=0 ' ' 

(fc-2 / , V Cm™— 1 m — 1 

(J ({**-! =P> fll D (l« Cm "(^o+ E a- 1 - 1 <i 1 +^aH-- 1 )" tl(( , +1)n _ 1 )| 
p=0 ^ m=l i=0 i=0 

> 2 R '- 1 (\a\ + 5)(U-«m-l)n a (l+» m )n Ao j| 



fc-2 



fc-2 



m— 1 



p=0 \ " / ^m=l i=0 i=0 



> 2 ii '- 1 (|a| + 5)(Cm-«m-l)n a (l+« m )n Ao 



Sfe-1 =P 



(20) 



/ Cm™— 1 TO— 1 s n 

p| |a^"(x + E ^ 1 ^+E a(_C " 1 Nc I +i)n-i)l> 2fl ^ 1 (l a l+ 5 ) (Cm_Sm_1) "« (1+S " ,) " A ol^-i=p) 

to=1 ^ i=0 i=0 ' J 

( P / Cmn—1 

<Pi p| I I E ^"^l > 2 i? '- 1 a- c '" n (|a| + 5)(Cm-»m-i)n a (i+« m )n Ao 

^ m=l ^ i=0 

m—1 

-Ixol-lE^^Nc + Dn-ll 



m—1 



i=0 



Sfe-1 = P 



< min P(| E «- 4 - 1 d,|>2 ii '- 1 (^^) (<:m ^ m_1) "(A) (1+Sm)nA o-|^|-(2 ii '- 1 -l/2)Ao 
o<m<s fc _i I \ f — * a a 

V \ ? _Q II II 



m — 1 



£ | a |n(]£[+*)(C*-*-l)n ( « ) ( Sm +l)n (2 fl'-l _ 1/2)Aq 



Sfc-1 =p 



< min 

0<m<s fc 



-l)n 



(n) (1+Sm) "Ao - \x \ - (2 R '~ 1 - 1/2)A 



m — 1 



^(j£l + *)(Cm-»m-l)n (l ^)(l+. m )n (2 fl'-l _ 1/2)Ao 



Sfe-1 =p 



(21) 



(22) 



Sfe-1 =P 



P(\d\ > 2 *'-l(M±*)(U-m-l)«( " )(l + «m)». _ j^j _ ( 2 *'-l _ l/ 2 )A 

V M l a l 

_^(j£j^)(U-m-l)n ( ^. ) (l+. m )n (2 fl'-l _ 1/2)Aq 

= p(V| > 2fl ' _1 (^|^) (Cm ^ ) "(j^) (Sm) "(^^)"( 1 - E 1 (^|^) (C ^ Cm) "(j^) (s, ^ m) "( 1 - 2^'))a 
-|*o| - (2*'" 1 - 1/2)A 



Sfe-1 =p 



-2 fl 'A 



Sfe-1 =p 



(23)

We observe that the number of errors needs to satisfy the
following relation for the above bound in (28) to be less than
1:

$$k - 1 > (1 + s_{k-1})/\kappa.$$

Finally, the probability that the number of incorrect transmissions
exceeds $\kappa(k-1) - 1$ is exponentially low, as we
observe in the following. Let, as before, $P_e(n) = P^e_{g|Z}(n)$.
We consider below the Chernoff-Sanov Theorem: the sum of
Bernoulli error events leads to a binomial distribution. Let, for
$1 > \zeta > 0$,

$$D(\zeta, P_e) = \zeta \log\Big(\frac{\zeta}{P_e}\Big) + (1 - \zeta) \log\Big(\frac{1 - \zeta}{1 - P_e}\Big).$$

Then, the following upper bound holds [19], for every $k \ge 3$:

$$\begin{aligned}
P\Big( \sum_{t=1}^{k-2} 1_{\{\text{Type II Error}\}} \ge \kappa(k-1) - 1 \Big)
&= P\Big( \frac{1}{k-2} \sum_{t=1}^{k-2} 1_{\{\text{Type II Error}\}} \ge \kappa - \frac{1-\kappa}{k-2} \Big) \\
&\le e^{-(k-2) D\left( (\kappa - \frac{1-\kappa}{k-2}), P_e \right)}. \qquad (32)
\end{aligned}$$

Hence,

$$P\Big( \frac{1}{k-2} \sum_{t=1}^{k-2} 1_{\{\text{Type II Error}\}} \ge \kappa - \frac{1-\kappa}{k-2} \Big)
\le \Big( e^{H(\kappa - \frac{1-\kappa}{k-2})} P_e^{(\kappa - \frac{1-\kappa}{k-2})} \Big)^{k-2} \tag{33}$$

with $H(z) = -z\log(z) - (1-z)\log(1-z) \le 1$. Hence,

$$P\Big( \frac{1}{k-2} \sum_{t=1}^{k-2} 1_{\{\text{Type II Error}\}} \ge \kappa - \frac{1-\kappa}{k-2} \Big)
\le \Big( e P_e^{(\kappa - \frac{1-\kappa}{k-2})} \Big)^{k-2}. \tag{34}$$
We could bound the following summation as follows. 

L«(fe-i)j-i 

E 

Sk-i=0 



h — 9\ ~ (*-l)-(*fc-i+l)/« I'" 

)M r (A )r 

Sfc-l/ 



,L(«-fef)(fe-2)J 
<M I .(A )(1-P e ) fc - 1 ( ^ 



(35) 



fc-2 

Sfc-l 



x( _^_ r (fc- 1) - 1 



< M r (A )2( fe - 2 )(P e ) (K -^ )(fc - 2) 
= Af J .(A )(2P e ( ^ fc5) )( fe_2) 



(36) 



(37) 



where (35)-(36) hold since $r$ can be taken to satisfy $r > \big(\frac{1 - P_e}{P_e}\big)^{\kappa}$
by taking $\Delta_0$ to be large enough, and in the summations $s_{k-1}$ is
taken to be $\kappa(k-1) - 1$. We also use the inequality

$$\sum_{s=0}^{\lfloor \kappa(k-1) \rfloor - 1} \binom{k-2}{s} \le 2^{k-2} \tag{38}$$

and that $\kappa(k-1) - 1 \le k - 2$.
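The Chernoff-type tail bound used in (32)-(34) and the counting inequality (38) can be checked numerically for a binomial number of errors. The sketch below compares the exact upper tail of a binomial distribution with $e^{-mD(\zeta, p)}$ and with the relaxed form $(e\,p^{\zeta})^m$, where $m$ plays the role of $k-2$ and $p$ the role of $P_e$. Natural logarithms are assumed, the bound is only meaningful for $\zeta > p$, and the function names are hypothetical.

```python
import math

def kl_bernoulli(zeta, p):
    """D(zeta, p) = zeta*ln(zeta/p) + (1-zeta)*ln((1-zeta)/(1-p)), 0 < zeta < 1."""
    return zeta * math.log(zeta / p) + (1 - zeta) * math.log((1 - zeta) / (1 - p))

def binomial_upper_tail(m, p, threshold):
    """Exact P(Bin(m, p) >= threshold)."""
    start = max(0, math.ceil(threshold))
    return sum(math.comb(m, s) * p ** s * (1 - p) ** (m - s)
               for s in range(start, m + 1))

def tail_bounds(m, p, zeta):
    """Return (exact tail, Chernoff bound, relaxed bound) for threshold zeta*m."""
    exact = binomial_upper_tail(m, p, zeta * m)
    chernoff = math.exp(-m * kl_bernoulli(zeta, p))
    relaxed = (math.e * p ** zeta) ** m
    return exact, chernoff, relaxed

def partial_binomial_sum(m, upper):
    """Sum of C(m, s) for s = 0..upper; never exceeds 2**m."""
    return sum(math.comb(m, s) for s in range(0, min(upper, m) + 1))

if __name__ == "__main__":
    print(tail_bounds(20, 0.001, 0.3))
    print(partial_binomial_sum(20, 5), 2 ** 20)
```

The relaxed form is much looser than the Chernoff bound, but it is the one that produces the geometric decay in $k$ used in the stopping time estimates.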

Thus, from (20) we have computed a bound on the stopping
time distributions through (34) and (37). Following similar
steps for the Type I-B error and no error cases at time 0,
we obtain the bounds on the stopping time distributions as
follows:

• Conditioned on an error in the granular region (Type I-A) at
time 0, the condition for the number of errors is that

$$k - 1 > (1 + s_{k-1}) \frac{1}{\kappa}$$

and by adding (33) and (37), the stopping time is dominated by:

$$P(\tau_1 \ge kn) \le M_r(\Delta_0) \Big( 2 P_e^{(\kappa - \frac{1-\kappa}{k-2})} \Big)^{k-2} + \Big( e P_e^{(\kappa - \frac{1-\kappa}{k-2})} \Big)^{k-2}
\le \Xi(\Delta_0) \Big( e P_e^{(\kappa - \frac{1-\kappa}{k-2})} \Big)^{k-2} \tag{39}$$

for $\Xi(\Delta) = M_r(\Delta) + 1$, which goes to 1 as $\Delta \to \infty$.

• Conditioned on the error that $Z$ is the decoding output
at time 0, in the above, the condition for the number of
errors is that

$$k - 1 > s_{k-1} \frac{1}{\kappa}$$

and we may replace the exponent term $(\kappa - \frac{1-\kappa}{k-2})$ with
$(\kappa + \frac{\kappa}{k-2})$, and the stopping time is dominated by

$$P(\tau_1 \ge kn) \le \Xi(\Delta_0) \Big( e P_e^{(\kappa + \frac{\kappa}{k-2})} \Big)^{k-2} \tag{40}$$

for $\Xi(\Delta) = M_r(\Delta) + 1$, which goes to 1 as $\Delta \to \infty$.

• Conditioned on no error at time 0 and the rate condition
$R' > \log_2(|a|/\alpha)$, the condition for the number of errors
is that

$$k - 1 > 1 + s_{k-1} \frac{1}{\kappa}$$

and we may replace the exponent term $(\kappa - \frac{1-\kappa}{k-2})$ with
$\kappa$.

The reason for this is that $|x_0 - \hat{x}_0| \le \Delta_0/2$ and the
control term applied at time $n$ reduces the error.
As a result, (22) writes as (41) in this case. Since $2^{R'-1}(\frac{\alpha}{|a|})^n > 1$,
the effect of the additional 1 in the exponent for $(\frac{\alpha}{|a|})$
can be excluded, unlike the case with $P^e_{g|g}$ above in (23).
As a result, the stopping time is dominated by

$$P(\tau_1 \ge kn) \le \Xi(\Delta_0) \big( e P_e^{\kappa} \big)^{k-2} \tag{42}$$

for $\Xi(\Delta) = M_r(\Delta) + 1$, which goes to 1 as $\Delta \to \infty$.
This completes the proof of the Lemma.



C. Proof of Theorem 2.2

Once we have the Markov chain by Lemma 5.1 and the
bound on the distribution of the sequence of stopping times
defined in (15) by Lemma 5.2, we will invoke Theorem 7.2 or
Theorem 7.3 with Lyapunov functions $V(x, \Delta) = \log_2(\Delta^2)$,
$f(x, \Delta)$ taken as a constant and $C$ a compact set.

As mentioned in Remark 2.2, for a DMC with block length
$n$, Shannon's random coding method satisfies:

$$P_e(n) := \max_{c \in \{1, 2, \dots, M(n)\}} P(c' \ne c \mid c \text{ is transmitted}) \le e^{-nE(R) + o(n)}$$
p Cm n-1 m - 1

P[ f](\a^ n (x + E «^ 1 4+E flK " 1 Nc+i)-i)l 

— 1 i=0 i=0 



m—1 

\xo - xq\ - 



> 2 fl/ - 1 (|a| +<5) (C '"- S "^ 1),l a (1+Sm)n |s fc _i = pj 

p[ \ y a— 1 ^! >2 i? '- i (4^) (Cm ^ m ^ 1)n (A) (1+Sm)n 

i= o 

E |a|«(M±^)^-- 1 (^)(-+ 1 )™(2«'- 1 - 1/2)A ) } 

< min (pf|J| > (2«'-i(H+*)(C™-»-i)»( _ l/2)A 

o<m<s fc _i (_ \ |a| |a| 

" E ( M±i)«-^-«"( " )(1 +Si )" (2 ii'-1 _ 1/2)Ao > ) | 

S m m J) 

-'£ 1 ( 2H "Hn)"(4 : ^) < ^*^ 1, *'(A)*"'(i-2- ,i ')A ) N 



< min 



(41) 



with $c'$ being the decoder output. Here $o(n)/n \to 0$ as $n \to \infty$
and $E(R) > 0$ for $0 < R < C$. Thus, by Lemma 5.2 we
observe that,

oo 

£[ti|z ,A ] = E F ( T i ^ fc ) ^ K A (n) < oo,(43) 
fe=i 

for some finite number K' Aa (n). The finiteness of this expres- 
sion follows from the observation that for k — 2 > — -, the 

exponent in e^ 7 ^^ 7 ^^^^ ~^ becomes negative. Fur- 
thermore, K' A() (n) is monotone decreasing in Ao since M r (A) 
is decreasing in A. 
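The finiteness of the expected stopping time can be illustrated by summing a geometric tail bound of the kind obtained in Lemma 5.2. The sketch below uses the no-error form $\min(1, \Xi\,(e P_e^{\kappa})^{k-2})$ as the tail bound on the number of blocks and sums it over $k$; treating $\Xi$ as a fixed constant, the truncation tolerance, and the function names are assumptions made for this example only.

```python
import math

def tail_bound(k, pe, kappa, xi=1.0):
    """min(1, xi * (e * pe**kappa)**(k - 2)) for k >= 2, and 1 for k < 2."""
    if k < 2:
        return 1.0
    return min(1.0, xi * (math.e * pe ** kappa) ** (k - 2))

def expected_blocks_bound(pe, kappa, xi=1.0, tol=1e-15, k_max=1_000_000):
    """Upper bound on E[tau_1 / n] obtained by summing the tail bound over k >= 1."""
    if math.e * pe ** kappa >= 1.0:
        raise ValueError("geometric ratio e * pe**kappa must be below 1")
    total = 0.0
    for k in range(1, k_max + 1):
        term = tail_bound(k, pe, kappa, xi)
        total += term
        if term < tol:
            break  # remaining geometric terms are negligible
    return total

if __name__ == "__main__":
    print(expected_blocks_bound(1e-3, 0.5))
```

The sum is finite exactly when the geometric ratio is below 1, which is why a sufficiently small block error probability is needed.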

We now apply the random-time drift result in Theorem 7.2
and Corollary 7.1 below. First, observe that the probability
that $\tau_{z+1} \ne \tau_z + n$ is upper bounded by the probability below:



g\g 

f-(l - nia - Ph Q )2p(d> (2 R '(^) n - l)A /2 



+2P| |fl P d > (2 H -*(\a\ + S) n )A - \a n x \ 



< P. 



g\g 



+(1 - P fl % - P|| ff )2PU > (2 R '( —) n - l)A /2 

+P|, Wd > 2«'- 1 ((|a| + S) n - |a|")A n 
=:T(A T0 ) (44) 
Observe that, provided that $R'(n) > n\log_2(|a|/\alpha)$,

$$\lim_{\Delta_0 \to \infty} T(\Delta_{\tau_0}) = P^e_{g|g}.$$

We now pick the Lyapunov function $V(x, \Delta) = \log_2(\Delta^2)$
and $f(x, \Delta)$ a constant to obtain (46), where $\chi > 0$ is an
arbitrarily small positive number. In (45) we use the fact that
zooming out for all time stages after $\tau_z + n$ provides a worst
case sequence and that, by Hölder's inequality, for a random
variable $X$ and an event $A$ the following holds:

$$E[X 1_A] \le \big(E[|X|^{1+\chi}]\big)^{\frac{1}{1+\chi}} \big(E[1_A^{\frac{1+\chi}{\chi}}]\big)^{\frac{\chi}{1+\chi}}
= \big(E[|X|^{1+\chi}]\big)^{\frac{1}{1+\chi}} \big(P(A)\big)^{\frac{\chi}{1+\chi}}. \tag{47}$$
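The Hölder bound (47) can be sanity-checked on a finite probability space. The following minimal sketch evaluates both sides for a discrete random variable and an event given as a list of indicator flags; the discrete setting and the function name are assumptions of this example.

```python
def holder_sides(values, probs, event, chi):
    """Return (lhs, rhs) of E[X 1_A] <= (E|X|^{1+chi})^{1/(1+chi)} * P(A)^{chi/(1+chi)}.

    values[i] occurs with probability probs[i]; event[i] is True when
    outcome i belongs to A.
    """
    lhs = sum(v * p for v, p, e in zip(values, probs, event) if e)
    moment = sum(abs(v) ** (1 + chi) * p for v, p in zip(values, probs))
    p_a = sum(p for p, e in zip(probs, event) if e)
    rhs = moment ** (1 / (1 + chi)) * p_a ** (chi / (1 + chi))
    return lhs, rhs

if __name__ == "__main__":
    print(holder_sides([1, 2, 5, 10], [0.4, 0.3, 0.2, 0.1],
                       [False, False, True, True], chi=0.5))
```

A small $\chi$ makes the moment requirement on $X$ mild at the cost of a weaker power of $P(A)$, which is the trade-off exploited in the drift argument.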



Now, the last term in (45) will converge to zero with $n$ large
enough and $\Delta_{\tau_z} \to \infty$ for some $\chi > 0$, since by Lemma 5.2
$P(\tau_{z+1} = \tau_z + kn)$ is bounded by a geometric measure and
the expectation of $((\tau_{z+1} - \tau_z - 1)\log_2(|a| + \delta))^{1+\chi}$ is finite
and monotone decreasing in $\Delta_0$. The second term in (46) is
negative with $P^e_{g|g}$ sufficiently small.

These imply that, for some sufficiently large $F$, the equation

$$E[\log(\Delta^2_{\tau_{z+1}}) | \Delta_{\tau_z}, h_{\tau_z}] \le \log(\Delta^2_{\tau_z}) - b_0 + b_1 1_{\{|\Delta_{\tau_z}| \le F\}} \tag{48}$$

holds for some positive $b_0$ and finite $b_1$. Here, $b_1$ is finite since
$K'(n)$ is finite. With the uniform boundedness of (43) over the
sequence of stopping times, this implies by Theorem 7.3 that

$\{(x, \Delta) : |\Delta| \le F, \ |\frac{x}{2^{R'-1}\Delta}| \le 1\}$ is a recurrent set. □



D. Proof of Theorem I2.il 

The process $(x_{tn}, \Delta_{tn})$ is a Markov chain, as was observed
in Lemma 5.1. In this section, we establish irreducibility of this
chain and the existence of a small set (see Section VII-B) to
be able to invoke Theorem 7.2, in view of (48). The following
generalizes the analysis in [97] and [94].

Let the values taken by

$\log_2(Q(\Delta))/s$
E[log(A 2 Tz+i )\x Tz ,A Tz ]

= £[log(A^ +1 )(l { Typ e i_a error at r 2 } + 1 {Type I-B error at t z } + 1 {no error at t z })\ x t z , A T J 



< (1 - P 



Z\g 



P s e |g ) nlog 2 (a)+nP[log 2 ((|a + £) 



2(^ + 1-1)- 



1 {r 2+ i>r 2 +«})|no error] 



+Pz\ g (nbS2(\*\ + S ) +^[log 2 ((|a + 5) 2 ^+ 1 - 1 ))l K+1>Ti+ „ } )|Type I-B error] 
+P 9 e | Ynlog 2 (a) +n^[log 2 ((|a + J) 2 ^+ 1 - 1 ))l K+1>Ti+ „ } )|Type I-A error] 



+ log 2 (A?J 

log(Ai ) + J (1 - P|i ) log 2 (a) + PI,, log 2 (|a| + 5) 



+riP 



log 2 ((|a + fl 2 < T *+ 1 - 1 >)l {T , +1>T , +n} ) 



< log(A 2 ) + n (1 - P|i ) log 2 (a) + PJ |9 log 2 (|a| + 5) 



P(r z+ i >r,+n)) *n( ^ P(r z+ i = r z + kn)((k - 1) log 2 (|a| + S)) 1+x 



k=2 



< log(A 2 J + n (1 - P| k ) log 2 (a) + P§,, log 2 (|a| + 5) 



+(T(A t J)tW ]T P(r, +1 = t z + kn){{k - 1) log 2 (|a| + 8)) 



i+x 



fc=2 



(45) 



(46) 



be $\{-A, 0, B\}$. Here $A, B$ are relatively prime. Let $\mathbb{L}_{z_0,A,B}$
be defined as

$$\{n \in \mathbb{N},\ n \ge \log_2(L')/s : \exists N_A, N_B \in \mathbb{N},\ n = -N_A A + N_B B + z_0\},$$

where $z_0 = \log_2(\Delta_0)/s$ is the initial condition of the parameter
for the quantizer. We note that since $A, B$ are relatively
prime, by Bezout's Lemma (see [1]) the communication class
will include the bin sizes whose logarithms are integer multiples
of a constant except those leading to $\Delta < L'$. Since we
have $\Delta_{(t+1)n} = Q(\Delta_{tn}, c'_{(t+1)n-1})\Delta_{tn}$, it follows that

$$\log_2(\Delta_{(t+1)n})/s = \log_2(Q(\Delta_{tn}, c'_{(t+1)n-1}))/s + \log_2(\Delta_{tn})/s$$

is also an integer. Furthermore, since the source process $\{x_{tn}\}$
is "Lebesgue-irreducible" (the system noise admits a probability
density function that is positive everywhere), and there
is a uniform lower bound $L'$ on bin-sizes, the error process
takes values in any of the admissible quantizer bins with non-zero
probability. Consider two integers $k, l \ge \log_2(L')/s$. For
all $l, k \in \mathbb{L}_{z_0,A,B}$, there exist $N_A, N_B \in \mathbb{N}$ such that $l - k = -N_A A + N_B B$.
We can show that the probability of $N_A$ occurrences
of perfect zoom, and $N_B$ occurrences of under-zoom
phases is bounded away from zero. This set of occurrences
includes the event that in the first $N_A$ time stages perfect zoom
occurs and later, successively, $N_B$ times the under-zoom
phase occurs. Considering worst possible control realizations


and errors, the probability of this event is lower bounded by 
(p(d& [^'W-^-larL'^")- 1 !/ - \a\ n L']j 

x fpfd e [-(a n 2 R ' - a n )L'/2, {a n 2 R ' - a n )L' /2]) 

\ N A 

x(l-P e )J >0, (49) 

where $d = \sum_{i=0}^{n-1} a^{n-i-1} w_i$ is a Gaussian random variable.
The above follows from considering the sequence of zoom-ins
and zoom-outs and the behavior of $a^n(x_{tn} - \hat{x}_{tn}) + d$.
In the above discussion, $P_e(Z|i)$ is the conditional error on
the zoom symbol given the transmission of granular bin $i$,
with the lowest error probability (if the lowest such error
probability is zero, an alternative sequence of events can be
provided through events concerning the noise variables leading
to zooming). Thus, for any two such integers $k, l$ and for some
$r > 0$, $P(\log_2(\Delta_{(t+r)n}) = ls \mid \log_2(\Delta_{tn}) = ks) > 0$.
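The reachability argument based on Bezout's Lemma can be made concrete: for relatively prime $A$ and $B$, any integer difference $l - k$ can be written as $-N_A A + N_B B$ with non-negative integers $N_A, N_B$. The sketch below finds one such pair by a direct search; the function name and the search strategy are choices made for this example and are not part of the paper's proof.

```python
import math

def zoom_counts(A, B, diff):
    """Return non-negative (N_A, N_B) with -N_A*A + N_B*B == diff.

    A and B must be relatively prime positive integers.
    """
    if math.gcd(A, B) != 1:
        raise ValueError("A and B must be relatively prime")
    # A is invertible modulo B, so some N_A in 0..B-1 makes diff + N_A*A divisible by B.
    for n_a in range(B):
        if (diff + n_a * A) % B == 0:
            n_b = (diff + n_a * A) // B
            # Adding B to N_A and A to N_B leaves -N_A*A + N_B*B unchanged.
            while n_b < 0:
                n_a += B
                n_b += A
            return n_a, n_b

if __name__ == "__main__":
    print(zoom_counts(3, 5, 1))
    print(zoom_counts(3, 5, -7))
```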

We can now construct a small set and make the connection
with Theorems 2.2 and 7.2. Define

$$C_x \times C'_\Delta = \{(x, \Delta) : L' \le \Delta \le F,\ |h| \le 1,\ \log_2(\Delta)/s \in \mathbb{Z}\}.$$

We will show that the recurrent set $C_x \times C'_\Delta$ is small.

Towards this end, we first establish irreducibility. For some
distribution $\mathcal{K}$ on positive integers, $E \subset \mathbb{R}$ and $\Delta$ an admissible
bin size,

$$\sum_{n} \mathcal{K}(n) P\big( (x_n, \Delta_n) \in (E \times \{\Delta\}) \,\big|\, x_0, \Delta_0 \big) \ge K_{\Delta_0, \Delta} \psi(E, \Delta).$$

Here $K_{\Delta_0, \Delta}$, denoting a lower bound on the probability of
visiting $\Delta$ from $\Delta_0$ in some finite time, is non-zero by (49)
and $\psi$ is a positive map as the following argument shows.
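Let $t > 0$ be a time stage for which $\Delta_{tn} = \Delta$ and thus,
with $|h_{(t-1)n}| \le 1$: $|a^n x_{(t-1)n} + b u_{(t-1)n}| \le |a|^n \Delta_{(t-1)n}/2 = (|a|/\alpha)^n \Delta/2$.
Thus, it follows that, for $A_1, B_1 \in \mathbb{R}$, $A_1 < B_1$,

$$\begin{aligned}
& P\Big( x_{tn} \in [A_1, B_1] \,\Big|\, |a^n x_{(t-1)n} + b u_{(t-1)n}| \le |a|^n \Delta_{(t-1)n}/2 \Big) \\
&= P\Big( d_{t-1} \in [A_1 - (a^n x_{(t-1)n} + b u_{(t-1)n}),\ B_1 - (a^n x_{(t-1)n} + b u_{(t-1)n})] \,\Big|\, |a^n x_{(t-1)n} + b u_{(t-1)n}| \le |a|^n \Delta_{(t-1)n}/2 \Big) \\
&\ge \min_{|z| \le \frac{\Delta}{2}(|a|/\alpha)^n} P\big( d_{t-1} \in [A_1 - z, B_1 - z] \big). \qquad (50)
\end{aligned}$$

Thus, in view of (50), $\psi$ satisfies, for $A_1 < B_1$, $A_1, B_1 \in \mathbb{R}$,

$$\psi([A_1, B_1], \Delta) \ge \min_{|z| \le \frac{\Delta}{2}(|a|/\alpha)^n} P\big( d_{t-1} \in [A_1 - z, B_1 - z] \big) > 0.$$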

The chain satisfies the recurrence property that

$$P_{(x, \Delta)}(\tau_{C_x \times C'_\Delta} < \infty) = 1,$$

for any admissible $(x, \Delta)$. This follows from the construction
of

$$\Theta_k(\Delta, x) := P(\tau_1 \ge kn | x, \Delta),$$

where

$$\tau_1 = \inf(kn > 0 : |x_{kn}| \le 2^{R'-1}\Delta_{kn},\ x_0 = x,\ \Delta_0 = \Delta)$$

and observing that $\Theta_k(\Delta, x)$ is majorized by a geometric measure
with similar steps as in Section V-B. Once a state which
is perfectly zoomed, that is, which satisfies $|x_t| \le 2^{R'-1}\Delta_t$, is
visited, the stopping time analysis can be used to verify that
from any initial condition the recurrent set is visited in finite
time with probability 1.

We will now establish that the set $C_x \times C'_\Delta$ is small. By
Theorem 5.5.7 of [62], under aperiodicity and irreducibility,
every petite set is small. To establish the petite set property,
we will follow an approach taken by Tweedie [88] which
considers the following test, which only depends on the one-stage
transition kernel of a Markov chain: If a set $S$ is such
that the following uniform countable additivity condition

$$\lim_{k \to \infty} \sup_{x \in S} P(x, B_k) = 0 \tag{51}$$

is satisfied for every sequence $B_k \downarrow \emptyset$, and if the Markov chain
is irreducible, then $S$ is petite (see Lemma 4 of Tweedie
and Proposition 5.5.5(iii) of Meyn-Tweedie [62]).

Now, the set $C_x \times C'_\Delta$ satisfies the uniform countable additivity
condition since for any given bin size $\Delta'$ in the countable
space constructed above, (51) holds. This follows from the fact
that the Gaussian random variable $d$ satisfies

$$\lim_{k \to \infty} \sup P(d \in A_k) = 0,$$

uniformly over a compact set $C_0$, for any sequence $A_k \downarrow \emptyset$,
since a Gaussian measure admits a uniformly bounded density
function. Hence, $C_x \times C'_\Delta$ is petite.

Finally, aperiodicity of the sampled chain follows from the fact that the smallest admissible state for the quantizer, $L'$, can be visited in subsequent sampled time stages, since



P{d£ 



-2 fl '- 1 i7|al 



L' -2 Jl '- 1 L7lor 



L']) > 0. 



Thus, the sampled chain is positive Harris recurrent. 



E. Proof of Theorem 2.4

By Kolmogorov's Extension Theorem, it suffices to check that the property holds for finite dimensional cylinder sets, since these sets generate the $\sigma$-algebra on which the stochastic process measure is defined. Suppose first that the sampled Markov chain is stationary. Consider two elements:

$$\begin{aligned}
P(x_{t+1+n} \in A_1, x_{t+2+n} \in A_2)
&= \int P\big(dx_{\lfloor \frac{t+1+n}{n} \rfloor n},\ x_{t+1+n} \in A_1,\ x_{t+2+n} \in A_2\big) \\
&= \int P\big(x_{t+1+n} \in A_1,\ x_{t+2+n} \in A_2 \mid x_{\lfloor \frac{t+1+n}{n} \rfloor n}\big)\, P\big(dx_{\lfloor \frac{t+1+n}{n} \rfloor n}\big) \\
&= \int P\big(x_{t+1} \in A_1,\ x_{t+2} \in A_2 \mid x_{\lfloor \frac{t+1}{n} \rfloor n}\big)\, P\big(dx_{\lfloor \frac{t+1}{n} \rfloor n}\big).
\end{aligned}$$

The above follows from the fact that the marginals $P(dx_{\lfloor \frac{t+1}{n} \rfloor n})$ and $P(dx_{\lfloor \frac{t+1+n}{n} \rfloor n})$ are equal, since the sampled Markov chain is positive Harris recurrent and assumed to be stationary, and the dynamics for inter-block times are time homogeneous Markov. The above is applicable for any finite dimensional set, thus for any element in the sigma field generated by the finite dimensional sets, on which the stochastic process is defined. Now, let for some event $A$, $T^{-n}A = A$, where $T$ denotes the shift operation (see Section VII-A). Then

$$P(A) = \lim_{k \to \infty} P(A \cap T^{-kn}A) = \lim_{k \to \infty} P(A)\, P(T^{-kn}A \mid A).$$

Note that a positive Harris recurrent Markov chain admits a unique invariant distribution and for every $x_0 \in \mathbb{R}$,

$$\lim_{k \to \infty} P(x_{kn} \in A \mid x_0) = \pi(A),$$

where $\pi(\cdot)$ is the unique invariant probability measure. Since such a Markov chain forgets its initial condition, it follows that for $A = T^{-n}A$:

$$P(A \cap T^{-kn}A) = P(A \cap A) = P(A)P(A),$$

thus, $P(A) \in \{0, 1\}$, and the process is $n$-ergodic.



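□

$$\begin{aligned}
&\lim_{k \to \infty} \sup_{(x,\Delta) \in C_x \times C_\Delta} P\big((x_{(t+1)n}, \Delta_{(t+1)n}) \in (B_k \times \Delta') \mid x_{tn} = x, \Delta_{tn} = \Delta\big) \\
&\quad = \lim_{k \to \infty} \sup_{(x,\Delta) \in C_x \times C_\Delta} P\big((d, \Delta_{(t+1)n}) \in ((B_k - (a^n x_{tn} + b u_{(t+1)n-1})) \times \Delta') \mid x_{tn} = x, \Delta_{tn} = \Delta\big) = 0. \qquad (51)
\end{aligned}$$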



P Proof of Theorem 12.51 For the second term in (l53l i. the convergence of the first 

We begin with the following result, which is a consequence expression is ensured with lim™ P^ g (\a\ + 5)W^ n a 2n 

of Theorem FPl ar, d mat Pe(\a\ + 5) (2 / K ^ n — > as n — > oc. By combining 

Lemma 5.3: Under the conditions of Theoreml23l we have the second and the third term, the desired result is obtained. 



that, if for some 7 > 0, b < 00, the following holds To show that lim™-^ E[x 2 nn ] < 00, we first show that for 



(n/n)-l 
jE[ E A Ll*0,A 

fc=0 
.2 p[A2 



some k > 0, 



(n/n)-i 



< A -S[A ri |x ,A ]+W {(Ao , Me(CixCh)} , £ 4 m | So ,Ao] < A^ 2 ^'- 1 ). (54) 



then, lim^oo £7[A£J < 00. □ m=0 

Now, under the hypotheses of Theorem 12.31 and observing 
that Type I-B and I- A errors are worse than the no error-case Now, 
at time for the stopping time tail distributions, we obtain d52l 
for some finite £1. In d52l ), we use the property that H(k) < 1 
and (O-dUl). 

We now establish that 



r E[A 2 T1 Isq.Aq] 

urn -7. < 1. 771 

A ^oo A 2 , = -E 



This is a crucial step in applying Theorem 17.21 

Following similar steps as in ( 1521 ), the following upper bound 

on [E[A\ I a; ,A ]/Ag) is obtained: 



(n/n)-l 

E \ E X mn\ X 0,&o] 
m=0 

(n/n)-l , tn-1 trj-1 

E a 2tn ( (s + £ «" <_1 ^) + ( E 

t=0 ^ i=0 z=0 

a?0) A 



(n/n)-l tn-1 

a-Pg\ g -P e z\ 9 ) ^ 2E i E « 2t "(^o+ E a^H) 2 |zo,A ] 



t=0 t=0 

tn-1 



/ 1 \ t=0 

x ( a 2n + — ^£'[A^ 1 l{ Tl>ri }|no error at time 0] J 

V / 7 +2£[ E """(E^Vl^oA], (55) 

+ (P g e | g )U 2 "(l + (|a|+<5) 2 " + ...(| a |+ ( 5) 2 (LiJ)™) t=o 4 =o 



+ J2 e k ~ 2 Pe K(k 1 ""(M +<5) 2(fc ~ 1 ~» )n (a) 2 ™ which follows from the observation that for X, Y random 

fe=|-il+i variables, E[(X + Y) 2 ] < 2E[X 2 } + 2E[Y 2 ]. 

x(\a\ + (5) (2/K)n "(A )^ Let us first consider the component: (a* (x +J2i?Lo 1 a" 1 ' 1 ^)) 2 . 

-n\ g ((H + S) 2n + 5(A ) 1 f pffi J + j) 2B ) ( 53 > f-/")" 1 



£[ E (a*"(^o+ E a^ 1 d l )) 2 |^o,A ] 



i=0 



Note now that 

Km Q P(n > n\no error at time 0,x ,A o ) = 0, = E ^ l {t<Tl/n} (a tn (x Q + E a - * -1 *)) 2 K A„] 



t=0 i=0 

■DC 



uniformly in xo with |/iq| < 1 and given the rate condition 
R'{n) > ralog 2 (|a|/a) by (©. Therefore, the first term in < ^ (E[(l {t<Tl/n} ) 1+x \x ,h Q 

(l53l to in the limit of large Ao, since lim„^oo k— log(P e ) + 
21og 2 (|a| + 5) < and we have the following upper bound 



l+X 



\a\ +5) 2n J2( eP e) k ~ 2 (\a\ + 8) 2n{k - 2) {a) 2n ) < 00 



x (E[(a tn (x + E a~' i ~ 1 ^)) 2(i ^ ) |a ; o,Ao]J(56) 



fc=2 

for sufficiently large n. for some x > 0, by Hölder's inequality.
(n/n)-l
E[ A?„|x ,A ] 

t=0 

, oo (i-1) 

^ ^l p l\9 E p ( r i = ^i T yp e : - A error ) E + ^) 2(fe ~ 1)n « 

^ 1=2 k=l 

I 

\2kn 



2n 



f (JO i — L 

+ A l P zJ E P ( T i = Zn |Type Z - B error) ^(|a| +<5) 2 

^ i=2 k=l 

(OO ?— 1 s 

J2 P(ri = ln\no error at time 0) £(|o| + ^^"a 2 " ) 
;=2 fc=i ' 



+A 2 P(n = n) 

< A 2 P; |g f f>( Tl = Zn|Type I-A error) M + a 2 " 



?=2 

oo 



(|a| +(5) 2 » - 1 

(|a|+<5) 2 (^ 



(OO 
P(n = HType I-B error)(|a| + S) (|&| + &)2n x 

( °° (la + (5) 2 ( ( - 1 )" 

+A 2 (1 - P s e |s - Pl |g ) ^ P ^ = errOT at time °) (| a | + / ) 2n_ 1 Q 

+A 2 P(r 1 - n) 



< A 2 P e 



(|a| + <5)( 2 / K )™„ 



-S(Ao) ( ^(e('- 2 ))P e (K)(i - 1 -^ ) (|a| + ^"^V" ) 



^(|a| + <5) 2 ™ 

+A§(1 - P g % - P&„)(|«| + of "5(A o) ( g (e P e -)'- 2 ^ ± g""^ a 2 ") 
+A 2 P(n = n ) 

< CiAo (52) 



Moreover, for some P 2 < oo, 

tn-l 
. (I 



P[fl 2 



= a 



»(x + J2 a^ 1 ^) 2( ^ ) |xo,A ] 



tn-l 



i=0 



< la 



2tn( 



i+x- 



XP[( 



^[(so + ^a-^diJ^lso.Ao] 

i=0 

l (2 i? '- 1 A ) 

I °^ a ^ )^|xo,A ] 



2*'-iA 
= |a| 2t "( i ?)(2«'- 1 A ) 2i ? 

<B 2 (2*- 1 A ) 2l ?|a| 2tn <4'>, 



V. 



(57) 



where the last inequality follows since for every fixed \ h \ < 1, 
the random variable /i + (ESo a - * -1 ^)/^' -1 A ) has a 
Gaussian distribution with finite moments, uniform on A > 



Thus, 



(Ti/n)-l tn-l 

E[ a2tn ( x o+ E a"" 1 ^) 2 No,A ] 

t=0 i=0 



t=0 



i+x 



< E( p [( 1 {*<^M) 1+X l^o,A ]^ 

B2( 2«'(«)-iA ) 2 ^|a| 2t "( J 



t=o 



£(s(Ao)(eP ( 



(«-fef) 



t-i 



P 2 1+ *A 2 |a| 2tn 



°° / 1- \ " 

£(s(Ao)(ePe (K -^V->r (1+x) ) 



t=0 



R 1+x A 2 

a 2 ^0 



(58)
for some finite $\zeta_2$ (for a fixed finite $n$). In the discussion
above we use the fact that we can pick \ > sucn mat 
(P e ) K \a\ 2n ( 1+x ^ < 1. Such a x exists, by the hypothesis that 



Following d53"l l. the only term which survives is 



lim„^ 00 P f f(|a| + (5) 2 " = 0. 

We now consider the second term in (55). Since $u_i$ is the quantizer output, which is bounded in magnitude in proportion with $\Delta_i$, the second term can be written as:
(ri/n)-l tn-1 

t=0 i=0 
(ri/n)-l (_1 

<E[ J2 a^C^a-^'-^A^f^Ao] 

(ri/n)-l t-1 

<E[ cL 2tn {J2a~ m 2 ( - R '- 1) (\a\+ S) m A ) 2 

t=0 i=0 

\xo,A ] 

(Ti/n)-l - I I i r \ 2 

<£[ E a 2 *^2(«'^)(^p) t »A J |* ,A ] 



t=o 



(l_(J^£)«) 



< A 2 Cs 



(59) 



for some finite $\zeta_3$, by the bound on the stopping time and arguments presented earlier.

Now, with (53), (54), and (58)-(59), we can apply Theorem 7.2. With some $\epsilon > 0$ (whose existence is justified by (53)),

<5(x, A) = eA 2 , /(x, A) = '—^x 2 , 

Kb,+Kb 

$C$ a compact set and $V(x, \Delta) = \Delta^2$, Theorem 7.2 applies and $\lim_{t \to \infty} E[x_{tn}^2] < \infty$.

Thus, with average rate strictly larger than $\log_2(|a|)$, stability with a finite second moment is achieved. Finally, the limit is independent of the initial distribution since the sampled chain is irreducible, by Theorem 2.3. Now, if the sampled process has a finite second moment, the average second moment for the state process satisfies

$$\lim_{N \to \infty} \frac{1}{N} E\Big[\sum_{k=0}^{N-1} x_k^2\Big] = \frac{1}{n} E_\pi\Big[\sum_{k=0}^{n-1} x_k^2 \,\Big|\, x_0, \Delta_0\Big],$$

which is also finite, where $E_\pi$ denotes the expectation under the invariant probability measure for $x_0, \Delta_0$. By the ergodic theorem for Markov chains (see Theorem 7.2), the above holds almost surely, and as a result

$$\lim_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N-1} x_k^2 = \frac{1}{n} E_\pi\Big[\sum_{k=0}^{n-1} x_k^2 \,\Big|\, x_0, \Delta_0\Big] < \infty \quad \text{a.s.}$$



G. Proof of Theorem 2.6

Proof follows from the observation that the number of errors 
in channel transmission when the state is under-zoomed, s, is 
zero. No errors take place in the phase when the quantizer is 
being zoomed out. 



P 9 e |9 [a 2 " 



(i + (\a\ + 5) 2n + ...(\a\ + s) 2 ^ n ^ 



which is to be less than 1. We can take k > 1/2 for this case. 
Now, 



lim P(t > kn\xo, Aq) 

A— >oo 

^> (M+ ?r*°" ^-) 



0, 



for k > ~. Hence, ]im n - yoo P^ g (\a\ + S) 2n — > is sufficient, 
since 2 > -. The proof is complete once we recognize P e as 

P„ e ,„. 



H. Proof of Theorem 4.2

We provide a sketch of the proof since the analysis follows 
from the scalar case, except for the construction of an adaptive 
vector quantizer and the associated stopping time distribution. 

Consider the following system 



$$\tilde{x}_{t+1} = \begin{bmatrix} \tilde{x}^1_{t+1} \\ \vdots \\ \tilde{x}^N_{t+1} \end{bmatrix} = \tilde{A}\tilde{x}_t + \tilde{B}u_t + \tilde{G}d_t, \qquad (60)$$

where $\tilde{A} = \mathrm{Diag}(\lambda^i)$ is a diagonal matrix, obtained via a similarity transformation: $\tilde{A} = U^{-1}AU$ and $\tilde{B} = U^{-1}B$, $\tilde{G} = U^{-1}G$, where $U$ consists of the eigenvectors of the matrix $A$. We can assume, without any loss, that $B$ is invertible, since otherwise, by the controllability assumption, we can sample the system with a period of at most $N$ to obtain an invertible control matrix.
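As a small illustration of the similarity transformation above, the following sketch diagonalizes a 2-by-2 matrix with real, distinct eigenvalues in pure Python, and evaluates the sum of the logarithms of the unstable eigenvalue magnitudes, which is the quantity the abstract compares against the channel capacity. The matrix `A` and the helper names (`matmul`, `inv2`, `eig2`, `diagonalize`, `rate_threshold`) are hypothetical choices made for illustration only; they are not taken from the paper.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(M):
    """Inverse of a 2x2 matrix; raises if the matrix is singular."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def eig2(A):
    """Eigenvalues and eigenvectors of a 2x2 matrix (real, distinct case only)."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    if disc <= 0:
        raise ValueError("sketch only covers real, distinct eigenvalues")
    root = math.sqrt(disc)
    lams = [(tr + root) / 2, (tr - root) / 2]
    vecs = []
    for lam in lams:
        v = (b, lam - a)          # satisfies the first row of (A - lam I) v = 0
        if v[0] == 0 and v[1] == 0:
            v = (lam - d, c)      # fall back to the second row
        vecs.append(v)
    return lams, vecs

def diagonalize(A):
    """Return eigenvalues, the eigenvector matrix U, and U^{-1} A U."""
    lams, vecs = eig2(A)
    U = [[vecs[0][0], vecs[1][0]], [vecs[0][1], vecs[1][1]]]  # eigenvectors as columns
    A_tilde = matmul(inv2(U), matmul(A, U))
    return lams, U, A_tilde

def rate_threshold(lams):
    """Sum of log2 |lambda| over the unstable eigenvalues (|lambda| > 1)."""
    return sum(math.log2(abs(l)) for l in lams if abs(l) > 1)

# Hypothetical example: one unstable mode (2.0) and one stable mode (0.5).
A = [[2.0, 1.0], [0.0, 0.5]]
lams, U, A_tilde = diagonalize(A)
threshold = rate_threshold(lams)
```

In this example only the eigenvalue of magnitude 2 is unstable, so only that mode contributes to the threshold.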

The approach now is quantizing the components in the sys- 
tem according to the adaptive quantization rule provided ear- 
lier, except for a joint mapping for the overflow region. We 
modify the scheme in © as follows: Let for i = 1,2, ... ,n, 
R'iin) = log 2 (2 fl *(") - 1) = log 2 (Ki(n)). The vector quan- 
tizer quantizes uniformly the marginal variables and we de- 
fine the overflow region as the quantizer outside the granu- 
lar region: Hf =1 [-2 R * ( - n ^- 1 A i ,2 R ^- 1 A i ] and for i = 
1,2,. ..,N 

N 

Q^\{x) =Z if x £ Y[[-2 R *W- 1 A h ,2 R *W- 1 A k \ 

fc=i 

and for x £ T^Ii [-2^(™)- 1 A l , 2 H '»~ 1 A 1 }, the quantizer 
quantizes the marginal according to ©. Hence, here A 1 is the 
bin size of the quantizer in the direction of the eigenvector x l , 
with rate i?-(n). For 1 < i < N: 

Ut = -l{t=(k+l)n-l}B~ 1 A n Xkn, 

x t = QkI i x t)i 

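$$\Delta^i_{t+1} = \Delta^i_t\, \bar{Q}^i(\Delta^i_t, c'_t), \qquad (61)$$

with $\delta^i > 0$, $\alpha^i < 1$ and $L^i > 0$ such that

$$\bar{Q}^i(\Delta^i, c') = (|\lambda^i| + \delta^i)^n \quad \text{if } c' = Z,$$
$$\bar{Q}^i(\Delta^i, c') = (\alpha^i)^n \quad \text{if } c' \neq Z,\ \Delta^i \geq L^i,$$
$$\bar{Q}^i(\Delta^i, c') = 1 \quad \text{if } c' \neq Z,\ \Delta^i < L^i,$$

and $R'_i(n) > n \log_2(|\lambda^i| / \alpha^i)$.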
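The zoom rule above can be sketched in a few lines for a single eigen-direction. The function and argument names (`zoom_factor`, `min_rate`, `overflow`, `margin`) are hypothetical, the numerical values are chosen only for illustration, and the sketch assumes that the zoom-in case applies when the bin size is at least $L^i$.

```python
import math

def zoom_factor(bin_size, overflow, lam, margin, alpha, L, n):
    """Multiplicative update of the bin size over one block of n time steps.

    overflow : True when the received symbol is the overflow symbol Z
    lam      : eigenvalue of the direction being quantized
    margin   : the positive constant added to |lam| when zooming out
    alpha    : zoom-in coefficient, assumed to lie in (0, 1)
    L        : smallest bin size below which no further zoom-in is applied
    """
    if overflow:
        return (abs(lam) + margin) ** n   # zoom out
    if bin_size >= L:
        return alpha ** n                 # zoom in
    return 1.0                            # hold the bin size

def min_rate(lam, alpha, n):
    """Lower bound n * log2(|lam| / alpha) that the rate must strictly exceed."""
    return n * math.log2(abs(lam) / alpha)

# Hypothetical parameters for one direction.
lam, margin, alpha, L, n = 2.0, 0.5, 0.5, 1.0, 2
out = zoom_factor(4.0, True, lam, margin, alpha, L, n)
zin = zoom_factor(4.0, False, lam, margin, alpha, L, n)
hold = zoom_factor(0.5, False, lam, margin, alpha, L, n)
bound = min_rate(lam, alpha, n)
```

With these values the bin size grows by a factor of 6.25 on overflow and shrinks by a factor of 0.25 otherwise, unless it is already below `L`.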

Instead of ( fT3T >. the sequence of stopping times is defined 
as follows. With r = 0, define 

t z+1 = infjfcn > t z : \h l kn \ < l,i = 1,2.. .,iV}, fc,z 6 Z + , 
where hi = — x *, . . Now, we observe that N— dimensional 

4 Aj2 R i" 1 

system: 

P(n > fcn^o, A ) 

, fc AT v 

= p ( n-cuci^i > i)} i^o,A j 
^t=i i=i ' 

< ^(U^**! > l)|zoom until k) \ x , A \ (62) 

AT 

< y^P(|fefc T J > l|zoom until fc,a; ,A ) (63) 
i=i 

where we apply chain rule for probability in d62b and the 
union bound in d63b . However, for each of the dimensions, 
P(\h l kn \ > 1 1 zoom until kn, xq,Aq) is dominated by an 
exponential measure, and so is the sum. Furthermore, P{t\ > 
n\xo, Aq) still converges to provided the rate condition R'^ri) 
log^(| A 1 \/a l ) is satisfied for every i, since P{t\ > n\xo, Aq) < 
P(\Ki\ > I^OjAo). Therefore, analogous results to 
(l45ll-(l48Ti are applicable. Once one imposes a countability con- 
dition for the bin size spaces as in Theorem 12.31 the desired 
ergodicity properties are established. □ 

VI. Concluding Remarks 

The paper considered stochastic stabilization of linear sys- 
tems driven by unbounded noise over noisy channels and es- 
tablished conditions for asymptotic mean stationarity. The con- 
ditions obtained are tight with an achievability and a converse. 
The paper also obtained conditions for the existence of fi- 
nite second moments. When there is unbounded noise, the 
result we obtained for the existence of finite second moments 
required further conditions on reliability for channels when 
compared with the bounded noise case considered by Sahai 
and Mitter. We do not have a converse theorem for the finite 
second moment discussion; it would be interesting to obtain a 
complete solution for this setup. 

We observed in the development that three types of errors were critical. These bring up the importance of unequal error coding schemes with feedback. Recent results in the literature [10] have focused on fixed length schemes without feedback and on variable length schemes with feedback; further research could be useful for networked control problems.

The value of information channels in optimization and con- 
trol problems (beyond stabilization) is an important problem 



in view of applications in networked control systems. Fur- 
ther research from the information theory community for non- 
asymptotic coding results will provide useful applications and 
insight for such problems. These can also be useful to tighten 
the conditions for the existence of finite second moments. 
Moderate channel lengths (73], CD, El, ED, and possible 
presence of noise in feedback 11221 are crucial issues needed 
to be explored better in the analysis and in the applications of 
random-time state-dependent drift arguments ||94l . 

Finally, we note that the assumption that the system noise 
is Gaussian can be relaxed. For the second moment stability, a 
sufficiently light tail which would provide a geometric bound 
on the stopping times as in (l39i l through (|27T i will be sufficient. 
For the AMS property, this is not needed. For a noiseless DMC, [97] established that a finite second moment for the system noise is sufficient for the existence of an invariant probability measure. We require, however, that the noise admits a density which is positive everywhere for establishing irreducibility.

A. Variable Length Coding and Agreement over a Channel 

Let us consider a channel where agreement on a binary event in finite time is possible between the encoder and the decoder. By binary events, we mean, for example, synchronization of encoding times and agreement on zooming times. It turns out that if the following assumption holds, then such agreements are possible in finite expected time: The channel is such that there exist input letters $x_1, x_2, x_3, x_4$ where $D(P(\cdot|x_1)\,\|\,P(\cdot|x_2)) = \infty$ and $D(P(\cdot|x_3)\,\|\,P(\cdot|x_4)) = \infty$. Here, $x_1$ can be equal to $x_4$ and $x_2$ can be equal to $x_3$. For example, the erasure channel satisfies this property. Note that the above condition is weaker than having a non-zero zero-error capacity, but stronger than what Burnashev's [14], [93] method requires, since there are more hypotheses to be tested.
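The divergence condition can be checked numerically for small channels. The sketch below, with hypothetical helper names (`kl_divergence`, `erasure_channel`, `symmetric_channel`) and illustrative parameter values, evaluates the divergence between the output distributions of two input letters. It returns infinity for a binary erasure channel in both directions, while a binary symmetric channel with crossover probability strictly between 0 and 1 gives a finite value and so does not meet the condition.

```python
import math

def kl_divergence(p, q):
    """D(p || q) in bits for two distributions on the same finite alphabet."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue              # terms with p(y) = 0 contribute nothing
        if qi == 0.0:
            return math.inf       # p puts mass where q has none
        total += pi * math.log2(pi / qi)
    return total

def erasure_channel(eps):
    """Binary erasure channel; outputs are ordered as (0, erasure, 1)."""
    return {0: [1.0 - eps, eps, 0.0], 1: [0.0, eps, 1.0 - eps]}

def symmetric_channel(p):
    """Binary symmetric channel; outputs are ordered as (0, 1)."""
    return {0: [1.0 - p, p], 1: [p, 1.0 - p]}

bec = erasure_channel(0.3)
bsc = symmetric_channel(0.1)

d_bec_01 = kl_divergence(bec[0], bec[1])   # take x_1 = 0, x_2 = 1
d_bec_10 = kl_divergence(bec[1], bec[0])   # take x_3 = 1, x_4 = 0
d_bsc_01 = kl_divergence(bsc[0], bsc[1])
```

For the erasure channel, the choice $x_1 = x_4 = 0$ and $x_2 = x_3 = 1$ is one that satisfies the stated assumption.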

In such a setting, one could use variable length encoding 
schemes. Such a design will allow the encoder and the decoder 
to have transmission in three phases: Zooming, transmission, 
and error confirmation. Using random-time, state-dependent 
stochastic drift, we may find alternative schemes for stochastic 
stabilization. 

VII. Appendix: Stochastic Stability of Dynamical 
Systems 

A. Stationary, Ergodic, and Asymptotically Mean Stationary 
Processes 

In this subsection, we review ergodic theory in the context of information theory (that is, with the transformations being specific to the shift operation). A comprehensive discussion is available in Shields [80] and Gray [31], [33].

Let $\mathbb{X}$ be a complete, separable, metric space. Let $\mathcal{B}(\mathbb{X})$ denote the Borel sigma-field of subsets of $\mathbb{X}$. Let $\Sigma = \mathbb{X}^\infty$ denote the sequence space of all one-sided or two-sided infinite sequences drawn from $\mathbb{X}$. Thus, for a two-sided sequence space, if $x \in \Sigma$ then $x = \{\ldots, x_{-1}, x_0, x_1, \ldots\}$ with $x_i \in \mathbb{X}$. Let $X_n : \Sigma \to \mathbb{X}$ denote the coordinate function such that $X_n(x) = x_n$. Let $T$ denote the shift operation on $\Sigma$, that is, $X_n(Tx) = x_{n+1}$. That is, for a one-sided sequence space, $T(x_0, x_1, x_2, \ldots) = (x_1, x_2, x_3, \ldots)$.

Let $\mathcal{B}(\Sigma)$ denote the smallest sigma-field containing all cylinder sets of the form $\{x : x_i \in B_i, m \leq i \leq n\}$ where $B_i \in \mathcal{B}(\mathbb{X})$, for all integers $m, n$. Observe that $\cap_{n \geq 0} T^{-n}\mathcal{B}(\Sigma)$ is the tail $\sigma$-field: $\cap_n \sigma(x_n, x_{n+1}, \ldots)$, since $T^{-n}(A) = \{x : T^n x \in A\}$.

Let $\mu$ be a stationary measure on $(\Sigma, \mathcal{B}(\Sigma))$ in the sense that $\mu(T^{-1}B) = \mu(B)$ for all $B \in \mathcal{B}(\Sigma)$. The sequence of random variables $\{x_n\}$ defined on the probability space $(\Sigma, \mathcal{B}(\Sigma), \mu)$ is a stationary process.

Such a process is aperiodic if $\mu(\{x : T^{-n}x = x\}) = 0$ for each integer $n$.

Definition 7.1: Let $P$ be the measure on a process. This random process is ergodic if $A = T^{-1}A$ implies that $P(A) \in \{0, 1\}$.

That is, the events that are unchanged with a shift operation are trivial events. Mixing is a sufficient condition for ergodicity. Thus, a source is ergodic if $\lim_{n \to \infty} P(A \cap T^{-n}B) = P(A)P(B)$, since the process forgets its initial condition. Thus, when one specializes to Markov sources, we have the following: A positive Harris recurrent Markov chain is ergodic, since such a process is mixing and stationary. We will discuss this further in the next section.

Definition 7.2: A random process is $N$-stationary (cyclo-stationary or periodically stationary with period $N$) if the process measure $P$ satisfies $P(T^{-N}B) = P(B)$ for all $B \in \mathcal{B}(\Sigma)$, or equivalently for any $n \in \mathbb{N}$ samples $t_1, t_2, \ldots, t_n$:

$$P(x_{t_1} \in A_1, x_{t_2} \in A_2, \ldots, x_{t_n} \in A_n) = P(x_{t_1+N} \in A_1, x_{t_2+N} \in A_2, \ldots, x_{t_n+N} \in A_n).$$

Definition 7.3: A random process is $N$-ergodic if $A = T^{-N}A$ implies that $P(A) \in \{0, 1\}$.

Definition 7.4: A set A £ 23 (X) is coordinate-recurrent if 

for some m £ Z + 



2) Birkhoff's ergodic theorem applies for bounded measurable functions $f$, if and only if the process is AMS [33].

Let

$$F = \Big\{x : \lim_{N \to \infty} \frac{1}{N} \sum_{i=0}^{N-1} f(T^i x) \text{ exists}\Big\}.$$

It follows that for an AMS process $m(F) = 1$, with $m$ being the stationary mean of the process. Birkhoff's Almost-Sure Ergodic Theorem states the following: If a dynamical system is AMS with stationary mean $m$, then all bounded measurable functions $f$ have the ergodic property, and with probability 1,

$$\lim_{N \to \infty} \frac{1}{N} \sum_{i=0}^{N-1} f(T^i x) = E_{m_x}[f],$$

where $E_{m_x}$ denotes the expectation under measure $m_x$ and $m_x$ is the resulting ergodic measure with initial state $x$ in the ergodic decomposition of the asymptotic mean ([31] Theorem 1.8.2): $m(A) = \int m_x(A)\, m(dx)$. Furthermore,

$$\lim_{N \to \infty} \frac{1}{N} E\Big[\sum_{i=0}^{N-1} f(T^i x)\Big] = E_m[f], \quad x \in F.$$

In fact, the above applies for all integrable functions (integrable with respect to the asymptotic mean).

Definition 7.6: A random process is second-moment stable if the following holds:

$$\lim_{N \to \infty} \frac{1}{N} E\Big[\sum_{m=0}^{N-1} (X_m(x))^2\Big] < \infty.$$

Definition 7.7: A random process is second-moment stable almost surely if the following limit exists and is finite almost surely:

$$\lim_{N \to \infty} \frac{1}{N} \sum_{m=0}^{N-1} (X_m(x))^2 < \infty.$$



{x m (x)&A} — oo, a.s. 



Definition 7.5: A process on a probability space $(\Omega, \mathcal{F}, P)$ is asymptotically mean stationary (AMS) if there exists a probability measure $\bar{P}$ such that

$$\lim_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N-1} P(T^{-k}F) = \bar{P}(F),$$

for all events $F$. Here $\bar{P}$ is called the stationary mean of $P$, and is a stationary measure.

$\bar{P}$ is stationary since, by definition, $\bar{P}(F) = \bar{P}(T^{-1}F)$ for all events $F$ in the tail sigma field for the shift. A cyclo-stationary process is AMS; that is, $N$-stationarity implies the AMS property (see for example [33] or [32] (Theorem 7.3.1)). Asymptotic mean stationarity is a very important property:

1) The Shannon-McMillan-Breiman Theorem (the Entropy Ergodic Theorem) applies to finite alphabet AMS sources [33] (see an extension for a more general class [3]). In this case, the ergodic decomposition of the AMS process leads to almost sure convergence of the conditional entropies.
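To make the AMS definition concrete, consider a toy process that is 2-stationary but not stationary: the deterministic sequence $x_k = k \bmod 2$. For the event $F = \{x_0 = 0\}$, the probabilities $P(T^{-k}F) = P(x_k = 0)$ alternate between 1 and 0, yet their Cesàro averages converge. The sketch below is only an illustration; the process and the function names (`prob_shifted_event`, `cesaro_mean`) are hypothetical and not from the paper.

```python
def prob_shifted_event(k):
    """P(T^{-k} F) for F = {x_0 = 0} when x_k = k mod 2 deterministically."""
    return 1.0 if k % 2 == 0 else 0.0

def cesaro_mean(N):
    """(1/N) * sum_{k=0}^{N-1} P(T^{-k} F), the average in the AMS definition."""
    return sum(prob_shifted_event(k) for k in range(N)) / N

not_stationary = prob_shifted_event(1) != prob_shifted_event(0)   # shift by 1 changes P
two_stationary = prob_shifted_event(2) == prob_shifted_event(0)   # shift by 2 does not
mean_even = cesaro_mean(1000)
mean_odd = cesaro_mean(1001)
```

The averages approach 1/2, which is the value the stationary mean assigns to $F$ in this example.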



B. Stochastic Stability of Markov Chains and Random-Time 
State-Dependent Drift Criteria 

In this section, we review the theory of stochastic stability of Markov chains. The reader is referred to Meyn and Tweedie [62] for a detailed discussion. The results on random-time stochastic drift follow from Yüksel and Meyn [94], [96].

We let $\phi = \{\phi_t, t \geq 0\}$ denote a Markov chain with state space $\mathbb{X}$. The basic assumptions of [62] are adopted: It is assumed that $\mathbb{X}$ is a complete separable metric space that is locally compact; its Borel $\sigma$-field is denoted $\mathcal{B}(\mathbb{X})$. The transition probability is denoted by $P$, so that for any $\phi \in \mathbb{X}$, $A \in \mathcal{B}(\mathbb{X})$, the probability of moving in one step from the state $\phi$ to the set $A$ is given by $P(\phi_{t+1} \in A \mid \phi_t = \phi) = P(\phi, A)$. The $n$-step transitions are obtained via composition in the usual way, $P(\phi_{t+n} \in A \mid \phi_t = \phi) = P^n(\phi, A)$, for any $n \geq 1$. The transition law acts on measurable functions $f: \mathbb{X} \to \mathbb{R}$ and measures $\mu$ on $\mathcal{B}(\mathbb{X})$ via

$$Pf(\phi) := \int_{\mathbb{X}} P(\phi, dy) f(y), \quad \phi \in \mathbb{X}, \qquad \mu P(A) := \int_{\mathbb{X}} \mu(d\phi) P(\phi, A), \quad A \in \mathcal{B}(\mathbb{X}).$$

A probability measure $\pi$ on $\mathcal{B}(\mathbb{X})$ is called invariant if $\pi P = \pi$. That is,

$$\int \pi(d\phi) P(\phi, A) = \pi(A), \quad A \in \mathcal{B}(\mathbb{X}).$$

For any initial probability measure $\nu$ on $\mathcal{B}(\mathbb{X})$ we can construct a stochastic process with transition law $P$, and satisfying $\phi_0 \sim \nu$. We let $P_\nu$ denote the resulting probability measure on sample space, with the usual convention for $\nu = \delta_\phi$ when the initial state is $\phi \in \mathbb{X}$. When $\nu = \pi$ then the resulting process is stationary.
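On a finite state space the objects above reduce to matrix operations, which gives a quick way to see what invariance means. The sketch below uses a hypothetical two-state transition matrix and hypothetical helper names (`apply_kernel`, `iterate`); it applies the map $\mu \mapsto \mu P$ repeatedly from two different initial distributions and arrives at the same invariant distribution in both cases.

```python
def apply_kernel(mu, P):
    """One step of mu -> mu P for a row-stochastic matrix P."""
    n = len(P)
    return [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]

def iterate(P, mu0, steps=200):
    """Distribution of the chain after `steps` transitions, started from mu0."""
    mu = list(mu0)
    for _ in range(steps):
        mu = apply_kernel(mu, P)
    return mu

# Hypothetical two-state chain; solving pi P = pi by hand gives pi = (5/6, 1/6).
P = [[0.9, 0.1],
     [0.5, 0.5]]

pi_from_state_0 = iterate(P, [1.0, 0.0])
pi_from_state_1 = iterate(P, [0.0, 1.0])
one_more_step = apply_kernel(pi_from_state_0, P)
```

Both starting points give the same limit here because this small chain is irreducible and aperiodic; the general-state-space conditions for such behavior are the subject of the definitions that follow.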

There is at most one stationary solution under the following irreducibility assumption. For a set $A \in \mathcal{B}(\mathbb{X})$ we denote,

$$\tau_A := \min(t \geq 1 : \phi_t \in A). \qquad (64)$$



Definition 7.8: Let $\varphi$ denote a sigma-finite measure on $\mathcal{B}(\mathbb{X})$.

(i) The Markov chain is called $\varphi$-irreducible if for any $\phi \in \mathbb{X}$, and any $B \in \mathcal{B}(\mathbb{X})$ satisfying $\varphi(B) > 0$, we have

$$P_\phi\{\tau_B < \infty\} > 0.$$

(ii) A $\varphi$-irreducible Markov chain is aperiodic if for any $\phi \in \mathbb{X}$, and any $B \in \mathcal{B}(\mathbb{X})$ satisfying $\varphi(B) > 0$, there exists $n_0 = n_0(\phi, B)$ such that

$$P^n(\phi, B) > 0 \quad \text{for all } n \geq n_0.$$

(iii) A $\varphi$-irreducible Markov chain is Harris recurrent if $P_\phi(\tau_B < \infty) = 1$ for any $\phi \in \mathbb{X}$, and any $B \in \mathcal{B}(\mathbb{X})$ satisfying $\varphi(B) > 0$. It is positive Harris recurrent if in addition there is an invariant probability measure $\pi$.

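Tied to $\varphi$-irreducibility is the existence of small or petite sets. A set $A \in \mathcal{B}(\mathbb{X})$ is small if there is an integer $n_0 \geq 1$ and a positive measure $\mu$ satisfying $\mu(\mathbb{X}) > 0$ and

$$P^{n_0}(\phi, B) \geq \mu(B), \quad \text{for all } \phi \in A, \text{ and } B \in \mathcal{B}(\mathbb{X}).$$

A set $A \in \mathcal{B}(\mathbb{X})$ is petite if there is a probability measure $\mathcal{K}$ on the non-negative integers $\mathbb{N}$, and a positive measure $\mu$ satisfying $\mu(\mathbb{X}) > 0$ and

$$\sum_{n=0}^{\infty} P^n(\phi, B)\, \mathcal{K}(n) \geq \mu(B), \quad \text{for all } \phi \in A, \text{ and } B \in \mathcal{B}(\mathbb{X}).$$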



Theorem 7.1: [62, Thm. 4.1] Suppose that $\phi$ is a $\varphi$-irreducible Markov chain, and suppose that there is a set $A \in \mathcal{B}(\mathbb{X})$ satisfying the following:

(i) $A$ is $\mu$-petite for some $\mu$.

(ii) $A$ is recurrent: $P_x(\tau_A < \infty) = 1$ for any $x \in \mathbb{X}$.

(iii) $A$ is finite mean recurrent: $\sup_{x \in A} E_x[\tau_A] < \infty$.

Then $\phi$ is positive Harris recurrent. □



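Let $\mathcal{T}_z$, $z \geq 0$, be a sequence of stopping times, measurable on a filtration generated by the state process, with $\mathcal{T}_0 = 0$.

Theorem 7.2: [94], [96] Suppose that $\phi$ is a $\varphi$-irreducible and aperiodic Markov chain. Suppose moreover that there are functions $V: \mathbb{X} \to [0, \infty)$, $\delta: \mathbb{X} \to [1, \infty)$, $f: \mathbb{X} \to [1, \infty)$, a small set $C$, and a constant $b \in \mathbb{R}$, such that the following hold:

$$E[V(\phi_{\mathcal{T}_{z+1}}) \mid \mathcal{F}_{\mathcal{T}_z}] \leq V(\phi_{\mathcal{T}_z}) - \delta(\phi_{\mathcal{T}_z}) + b\, 1_{\{\phi_{\mathcal{T}_z} \in C\}},$$
$$E\Big[\sum_{k=\mathcal{T}_z}^{\mathcal{T}_{z+1}-1} f(\phi_k) \,\Big|\, \mathcal{F}_{\mathcal{T}_z}\Big] \leq \delta(\phi_{\mathcal{T}_z}), \quad z \geq 0. \qquad (65)$$

Then the following hold:

(i) $\phi$ is positive Harris recurrent, with unique invariant distribution $\pi$.

(ii) $\pi(f) := \int f(\phi)\, \pi(d\phi) < \infty$.

(iii) For any function $g$ that is bounded by $f$, in the sense that $\sup_\phi |g(\phi)|/f(\phi) < \infty$, we have convergence of moments in the mean, and the Law of Large Numbers holds:

$$\lim_{t \to \infty} E_\phi[g(\phi_t)] = \pi(g),$$
$$\lim_{N \to \infty} \frac{1}{N} \sum_{t=0}^{N-1} g(\phi_t) = \pi(g) \quad \text{a.s.}$$

□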
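The Law of Large Numbers in item (iii) can be illustrated on a toy finite-state chain, where the invariant distribution is available in closed form. This is only a sanity-check sketch under stated assumptions: the two-state transition matrix, the function `g`, the seed, and the name `simulate_time_average` are all hypothetical, and a finite chain is far simpler than the general state space setting of the theorem.

```python
import random

def simulate_time_average(P, g, steps, seed=0, start=0):
    """Sample-path average (1/N) * sum_t g(phi_t) for a two-state chain."""
    rng = random.Random(seed)      # local generator so the run is reproducible
    state = start
    total = 0.0
    for _ in range(steps):
        total += g(state)
        # Move to state 0 with probability P[state][0], otherwise to state 1.
        state = 0 if rng.random() < P[state][0] else 1
    return total / steps

# Hypothetical chain; pi = (5/6, 1/6) solves pi P = pi for this matrix.
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = [5.0 / 6.0, 1.0 / 6.0]

def g(state):
    """Indicator of state 1, so pi(g) is the long-run fraction of time in state 1."""
    return float(state)

pi_g = sum(pi[s] * g(s) for s in range(2))
avg_from_0 = simulate_time_average(P, g, steps=200_000, start=0)
avg_from_1 = simulate_time_average(P, g, steps=200_000, seed=1, start=1)
```

Both sample-path averages land close to `pi_g` regardless of the initial state, which is the behavior item (iii) describes.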



Remark 7.1: We note that the condition $f: \mathbb{X} \to [1, \infty)$ can be relaxed to $f: \mathbb{X} \to [0, \infty)$ provided that one can show that there exists an invariant probability measure.

We conclude by stating a simple corollary to Theorem 7.2, obtained by taking $f(\phi) = 1$ for all $\phi \in \mathbb{X}$.

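Corollary 7.1: [94], [96] Suppose that $\phi$ is a $\varphi$-irreducible Markov chain. Suppose moreover that there is a function $V: \mathbb{X} \to (0, \infty)$, a small set $C$, and a constant $b \in \mathbb{R}$, such that the following hold:

$$E[V(\phi_{\mathcal{T}_{z+1}}) \mid \mathcal{F}_{\mathcal{T}_z}] \leq V(\phi_{\mathcal{T}_z}) - 1 + b\, 1_{\{\phi_{\mathcal{T}_z} \in C\}},$$
$$\sup_{z \geq 0} E[\mathcal{T}_{z+1} - \mathcal{T}_z \mid \mathcal{F}_{\mathcal{T}_z}] < \infty. \qquad (66)$$

Then $\phi$ is positive Harris recurrent. □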



The following is a useful result for the paper.

Theorem 7.3: [62] Without an irreducibility assumption, if (66) holds for a measurable set $C$ and a function $V: \mathbb{X} \to (0, \infty)$ with $\sup_{x \in C} V(x) < \infty$, then $C$ satisfies $\sup_{x \in C} E[\tau_C] < \infty$.

We have the following results. A positive Harris recurrent Markov process (thus with a unique invariant distribution on the state space) is also ergodic in the sense of ergodic theory. The ergodic theorem for Markov chains typically has a more specialized meaning, with the state process being a coordinate process in the infinite dimensional space $\mathbb{X}^\infty$ (see [44]); this, however, implies the definition in the more general sense. This follows from the fact that it suffices to test ergodicity on the sets which generate the sigma algebra (that is, the finite dimensional sets), which in turn can be verified by the recurrence of the individual sets; probabilistic relations in arbitrary finite sets characterize the properties in the infinite collection, and mixing leads to ergodicity.